Novel polynucleotides and polypeptides in pathogenic mycobacteria
and their use as diagnostics, vaccines and targets for
chemotherapy.

This invention relates to the novel polynucleotide sequence we 
have designated "GS" which we have identified in pathogenic 
mycobacteria. GS is a pathogenicity island within 8kb of DNA 
comprising a core region of 5.75kb and an adjacent transmissible 
element within 2.25kb. GS is contained within Mycobacterium 
paratuberculosis, Mycobacterium avium subsp. silvaticum and some 
pathogenic isolates of M. avium. Functional portions of the core 
region of GS are also represented by regions with a high degree 
of homology that we have identified in cosmids containing genomic 
DNA from Mycobacterium tuberculosis. 

Background to the invention 

Mycobacterium tuberculosis (Mtb) is a major cause of global 
diseases of humans as well as animals. Although conventional 
methods of diagnosis including microscopy, culture and skin 
testing exist for the recognition of these diseases, improved 
methods particularly new immunodiagnostics and DNA-based 
detection systems are needed. Drugs used to treat tuberculosis 
are increasingly encountering the problem of resistant organisms. 
New drugs targeted at specific pathogenicity determinants as well 
as new vaccines for the prevention and treatment of tuberculosis 
are required. The importance of Mtb as a global pathogen is 
reflected in the commitment being made to sequencing the entire 
genome of this organism. This has generated a large amount of 
DNA sequence data of genomic DNA within cosmid and other 
libraries. Although the DNA sequence is known in the art, the 
functions of the vast majority of these sequences, the proteins 
they encode, the biological significance of these proteins, and 
the overall relevance and use of these genes and their products 
as diagnostics, vaccines and targets for chemotherapy for 
tuberculous disease, remain entirely unknown. 

Mycobacterium avium subsp. silvaticum (Mavs) is a pathogenic 
mycobacterium causing diseases of animals and birds, but it can 
also affect humans. Mycobacterium paratuberculosis (Mptb) causes 
chronic inflammation of the intestine in many species of animals 
including primates and can also cause Crohn's disease in humans. 
Mptb is associated with other chronic inflammatory diseases of 
humans such as sarcoidosis. Subclinical Mptb infection is 
widespread in domestic livestock and is present in milk from 
infected animals. The organism is more resistant to 
pasteurisation than Mtb and can be conveyed to humans in retail 
milk supplies. Mptb is also present in water supplies, 
particularly those contaminated with run-off from heavily grazed 
pastures. Mptb and Mavs contain the insertion elements IS900 and 
IS902 respectively, and these are linked to pathogenicity in 
these organisms. IS900 and IS902 provide convenient highly 
specific multi-copy DNA targets for the sensitive detection of 
these organisms using DNA-based methods and for the diagnosis of 
infections in animals and humans. Much improvement is however 
required in the immunodiagnosis of Mptb and Mavs infections in 
animals and humans. Mptb and Mavs are, in general, resistant in 
vivo to standard anti-tuberculous drugs. Although substantial 
clinical improvements in infections caused by Mptb, such as 
Crohn's disease, may result from treatment of patients with 
combinations of existing drugs such as Rifabutin, Clarithromycin 
or Azithromycin, additional effective drug treatments are 
required. Furthermore, there is an urgent need for effective 
vaccines for the prevention and treatment of Mptb and Mavs 
infections in animals and humans based upon the recognition of 
specific pathogenicity determinants. 

Pathogenicity islands are, in general, 7-9kb regions of DNA 
comprising a core domain with multiple ORFs and an adjacent 
transmissible element. The transmissible element also encodes 
proteins which may be linked to pathogenicity, such as by 
providing receptors for cellular recognition. Pathogenicity 
islands are envisaged as mobile packages of DNA which, when they 
enter an organism, assist in bringing about its conversion from 
a non-disease-causing to a disease-causing strain. 

Figure 1(a) and (b) shows a linear map of the pathogenicity 
island GS in Mavs (Fig 1a) and in Mptb (Fig 1b). The main open 
reading frames are illustrated as ORFs A to H. ORFs A to F are 
found within the core region of GS. ORFs G and H are encoded by 
the adjacent transmissible element portion of GS. 

Disclosure of the invention 

Using a DNA-based differential analysis technology we have 
discovered and characterised a novel polynucleotide in Mptb 
(isolates 0022 from a Guernsey cow and 0021 from a red deer). 
This polynucleotide comprises the gene region we have designated 
GS. GS is found in Mptb using the identifier DNA sequences 
Seq.ID No 1 and 2, where Seq.ID No 2 is the complement of 
Seq.ID No 1. GS is also identified in Mavs. The DNA sequence 
incorporating the positive strand of GS from an isolate of Mavs, 
comprising the core region of GS and adjacent transmissible 
element, is given in Seq.ID No 3. DNA sequence comprising 
4435 bp of the positive strand of GS obtained from an isolate of 
Mptb, including the core region of GS (nucleotides 1614 to 6047 
of GS in Mavs), is given in Seq.ID No 4. The DNA sequence of GS 
from Mptb is highly (99.4%) homologous to GS in Mavs. The 
remaining portion of the DNA sequence of GS in Mptb is readily 
obtainable by a person skilled in the art using standard 
laboratory procedures. The entire functional DNA sequence 
including core region and transmissible element of GS in Mptb 
and Mavs as described above, comprise the polynucleotide 
sequences of the invention. 

There are 8 open reading frames (ORFs) in GS. Six of these, 
designated GSA, GSB, GSC, GSD, GSE and GSF, are encoded by the 
core DNA region of GS which, characteristically for a 
pathogenicity island, has a different GC content than the rest 
of the microbial genome. Two ORFs designated GSG and GSH are 
encoded by the transmissible element of GS whose GC content 
resembles that of the rest of the mycobacterial genome. The ORF 
GSH comprises two sub-ORFs H1 and H2 on the complementary DNA 
strand linked by a programmed frameshifting site so that a single 
polypeptide is translated from the ORF GSH. The nucleotide 
sequences of the 8 ORFs in GS and their translations are shown 
in Seq.ID No 5 to Seq.ID No 29 as follows: 

ORF A:  Seq.ID No 5   Nucleotides 50 to 427 of GS from Mavs 
        Seq.ID No 6   Amino acid sequence encoded by Seq.ID No 5. 

ORF B:  Seq.ID No 7   Nucleotides 772 to 1605 of GS from Mavs 
        Seq.ID No 8   Amino acid sequence encoded by Seq.ID No 7. 

ORF C:  Seq.ID No 9   Nucleotides 1814 to 2845 of GS from Mavs 
        Seq.ID No 10  Amino acid sequence encoded by Seq.ID No 9. 
        Seq.ID No 11  Nucleotides 201 to 1232 of GS from Mptb 
        Seq.ID No 12  Amino acid sequence encoded by Seq.ID No 11. 

ORF D:  Seq.ID No 13  Nucleotides 2785 to 3804 of GS from Mavs 
        Seq.ID No 14  Amino acid sequence encoded by Seq.ID No 13. 
        Seq.ID No 15  Nucleotides 1172 to 2191 of GS from Mptb 
        Seq.ID No 16  Amino acid sequence encoded by Seq.ID No 15. 

ORF E:  Seq.ID No 17  Nucleotides 4080 to 4802 of GS from Mavs 
        Seq.ID No 18  Amino acid sequence encoded by Seq.ID No 17. 
        Seq.ID No 19  Nucleotides 2467 to 3189 of GS from Mptb 
        Seq.ID No 20  Amino acid sequence encoded by Seq.ID No 19. 

ORF F:  Seq.ID No 21  Nucleotides 4947 to 5747 of GS from Mavs 
        Seq.ID No 22  Amino acid sequence encoded by Seq.ID No 21. 
        Seq.ID No 23  Nucleotides 3335 to 4135 of GS from Mptb 
        Seq.ID No 24  Amino acid sequence encoded by Seq.ID No 23. 

ORF G:  Seq.ID No 25  Nucleotides 6176 to 7042 of GS from Mavs 
        Seq.ID No 26  Amino acid sequence encoded by Seq.ID No 25. 

ORF H:  Seq.ID No 27  Nucleotides 7953 to 6215 of GS from Mavs. 

ORF H1: Seq.ID No 28  Amino acid sequence encoded by 
                      nucleotides 7953 to 7006 of Seq.ID No 27 

ORF H2: Seq.ID No 29  Amino acid sequence encoded by 
                      nucleotides 7009 to 6215 of Seq.ID No 27 
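
By way of illustration only, the coordinate conventions used in the listing above can be checked with a short script. The sketch below assumes that the quoted positions are 1-based and inclusive, and that a start position greater than the end position denotes an ORF read on the complementary strand; the names `Orf`, `orf_length`, `orf_strand` and `python_slice` are hypothetical helpers introduced here and form no part of the sequence listing.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    name: str
    start: int  # 1-based position of the first nucleotide, as listed
    end: int    # 1-based position of the last nucleotide, as listed


# Coordinates copied from the Mavs entries of the listing.
MAVS_ORFS = [
    Orf("A", 50, 427),
    Orf("B", 772, 1605),
    Orf("C", 1814, 2845),
    Orf("D", 2785, 3804),
    Orf("E", 4080, 4802),
    Orf("F", 4947, 5747),
    Orf("G", 6176, 7042),
    Orf("H1", 7953, 7006),
    Orf("H2", 7009, 6215),
]


def orf_length(orf: Orf) -> int:
    """Number of nucleotides spanned, counting both end points."""
    return abs(orf.end - orf.start) + 1


def orf_strand(orf: Orf) -> str:
    """'+' for start <= end, '-' for an ORF on the complementary strand."""
    return "+" if orf.start <= orf.end else "-"


def python_slice(orf: Orf) -> slice:
    """0-based, end-exclusive slice of the positive-strand string."""
    low, high = sorted((orf.start, orf.end))
    return slice(low - 1, high)


def summarise(orfs):
    """(name, strand, length, length divisible by 3) for each ORF."""
    return [
        (o.name, orf_strand(o), orf_length(o), orf_length(o) % 3 == 0)
        for o in orfs
    ]


if __name__ == "__main__":
    for row in summarise(MAVS_ORFS):
        print(row)
```

Under these assumptions each listed range spans a whole number of codons, which is a convenient sanity check when transcribing coordinates from a sequence listing.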

The polynucleotides in Mtb with homology to the ORFs B, C, E and 
F of GS in Mptb and Mavs, and the polypeptides they are now known 
to encode as a result of our invention, are as follows: 

ORF B:  Seq.ID No 30  Cosmid MTCY277 nucleotides 35493 to 34705 
        Seq.ID No 31  Amino acid sequence encoded by Seq.ID No 30. 

ORF C:  Seq.ID No 32  Cosmid MTCY277 nucleotides 31972 to 32994 
        Seq.ID No 33  Amino acid sequence encoded by Seq.ID No 32. 

ORF E:  Seq.ID No 34  Cosmid MTCY277 nucleotides 34687 to 33956 
        Seq.ID No 35  Amino acid sequence encoded by Seq.ID No 34. 

ORF E:  Seq.ID No 36  Cosmid MT024 nucleotides 15934 to 15203 
        Seq.ID No 37  Amino acid sequence encoded by Seq.ID No 36. 

ORF F:  Seq.ID No 38  Cosmid MT024 nucleotides 15133 to 14306 
        Seq.ID No 39  Amino acid sequence encoded by Seq.ID No 38. 

The proteins and peptides encoded by the ORFs A to H in Mptb and 
Mavs and the amino acid sequences from homologous genes we have 
discovered in Mtb given in Seq.ID Nos 31, 33, 35, 37 and 39, as 
described above and fragments thereof, comprise the polypeptides 
of the invention. The polypeptides of the invention are believed 
to be associated with specific immunoreactivity and with the 
pathogenicity of the host micro-organisms from which they were 
obtained. 

The present invention thus provides a polynucleotide in 
substantially isolated form which is capable of selectively 
hybridising to Seq.ID Nos 3 or 4 or a fragment thereof. The 
polynucleotide fragment may alternatively comprise a sequence 
selected from the group of Seq.ID No: 5, 7, 9, 11, 13, 15, 17, 
19, 21, 23, 25 and 27. The invention further provides a 
polynucleotide in substantially isolated form whose sequence 
consists essentially of a sequence selected from the group 
Seq.ID Nos. 30, 32, 34, 36 and 38, or a corresponding sequence 
selectively hybridizable thereto, or a fragment of said sequence 
or corresponding sequence. 

The invention further provides diagnostic probes such as a probe 
which comprises a fragment of at least 15 nucleotides of a 
polynucleotide of the invention, or a peptide nucleic acid or 
similar synthetic sequence specific ligand, optionally carrying 
a revealing label. The invention also provides a vector carrying 
a polynucleotide as defined above, particularly an expression 
vector. 

The invention further provides a polypeptide in substantially 
isolated form which comprises any one of the sequences selected 
from the group consisting of Seq.ID No: 6, 8, 10, 12, 14, 16, 18, 
20, 22, 24, 26, 28, 29, 31, 33, 35, 37 and 39, or a polypeptide 
substantially homologous thereto. The invention additionally 
provides a polypeptide fragment which comprises a fragment of a 
polypeptide defined above, said fragment comprising at least 8 
amino acids and an epitope. The invention also provides 
polynucleotides in substantially isolated form which encode 
polypeptides of the invention, and vectors which comprise such 
polynucleotides, as well as antibodies capable of binding such 
polypeptides. In an additional aspect, the invention provides 
kits comprising polynucleotides, polypeptides, antibodies or 
synthetic ligands of the invention and methods of using such kits 
in diagnosing the presence or absence of mycobacteria in a 
sample. The invention also provides pharmaceutical compositions 
comprising polynucleotides of the invention, polypeptides of the 
invention or antisense probes and the use of such compositions 
in the treatment or prevention of diseases caused by 
mycobacteria. The invention also provides polynucleotides for the 
prevention and treatment of infections due to GS-containing 
pathogenic mycobacteria in animals and humans and as a means of 
enhancing in vivo susceptibility of said mycobacteria to 
antimicrobial drugs. The invention also provides bacteria or 
viruses transformed with polynucleotides of the invention for use 
as vaccines. The invention further provides Mptb or Mavs in 
which all or part of the polynucleotides of the invention have 
been deleted or disabled to provide mutated organisms of lower 
pathogenicity for use as vaccines in animals and humans. The 
invention further provides Mtb in which all or part of the 
polynucleotides encoding polypeptides of the invention have been 
deleted or disabled to provide mutated organisms of lower 
pathogenicity for use as vaccines in animals and humans. 

A further aspect of the invention is our discovery of homologies 
between the ORFs B, C and E in GS on the one hand, and Mtb cosmid 
MTCY277 on the other (data from Genbank database using the 
computer programmes BLAST and BLIXEM). The homologous ORFs in 
MTCY277 are adjacent to one another consistent with the form of 
another pathogenicity island in Mtb. A further aspect of the 
invention is our discovery of homologies between ORFs E and F in 
GS and Mtb cosmid MT024 (also Genbank, as above) with the 
homologous ORFs close to one another. The use of polynucleotides 
and polypeptides from Mtb (Seq.ID Nos 30, 31, 32, 33, 34, 35, 36, 
37, 38 and 39) in substantially isolated form as diagnostics, 
vaccines and targets for chemotherapy, for the management and 
prevention of Mtb infections in humans and animals, and the 
processes involved in the preparation and use of these 
diagnostics, vaccines and new chemotherapeutic agents, comprise 
further aspects of the invention. 

A. Polynucleotides 

Polynucleotides of the invention as defined herein may comprise 
DNA or RNA. They may also be polynucleotides which include 
within them synthetic or modified nucleotides or peptide nucleic 
acids. A number of different types of modification to 
oligonucleotides are known in the art. These include 
methylphosphonate and phosphorothioate backbones, and the addition 
of acridine or polylysine chains at the 3' and/or 5' ends of the 
molecule. For the purposes of the present invention, it is to 
be understood that the polynucleotides described herein may be 
modified by any method available in the art. Such modifications 
may be carried out in order to couple the said polynucleotide to 
a solid phase or to enhance the recognition, the in vivo 
activity, or the lifespan of polynucleotides of the invention. 

A number of different types of polynucleotides of the invention 
are envisaged. In the broadest aspect, polynucleotides and 
fragments thereof capable of hybridizing to SEQ ID NO: 3 or 4 form 
a first aspect of the invention. This includes the 
polynucleotide of SEQ ID NO: 3 or 4. Within this class of 
polynucleotides various sub-classes of polynucleotides are of 
particular interest. 

One sub-class of polynucleotides which is of interest is the 
class of polynucleotides encoding the open reading frames A, B, 
C, D, E, F, G and H, including SEQ ID NOs: 5, 7, 9, 11, 13, 15, 
17, 19, 21, 23, 25 and 27. As discussed below, polynucleotides 
encoding ORF H include the polynucleotide sequences 7953 to 7006 
and 7009 to 6215 within SEQ ID NO: 27, as well as modified 
sequences in which the frame-shift has been modified so that the 
two sub-reading frames are placed in a single reading frame. 
This may be desirable where the polypeptide is to be produced in 
recombinant expression systems. 
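
The relationship between the two sub-reading frames can be illustrated arithmetically. The following is a minimal sketch, assuming the quoted positions 7953 to 7006 and 7009 to 6215 are 1-based and inclusive and that the ORF is read in the direction of decreasing position on the complementary strand; the function names are hypothetical and the interpretation of the result is offered as an observation, not as a statement of the mechanism.

```python
def overlap_length(range_a, range_b):
    """Nucleotides shared by two inclusive ranges given as (start, end).

    The ranges may be written in either orientation.
    """
    low_a, high_a = sorted(range_a)
    low_b, high_b = sorted(range_b)
    return max(0, min(high_a, high_b) - max(low_a, low_b) + 1)


def frame_offset_minus_strand(first_start, second_start):
    """Frame (0, 1 or 2) of second_start relative to first_start.

    Assumes both ORFs are read towards decreasing positions, so the
    distance travelled from the first start is first_start - second_start.
    """
    return (first_start - second_start) % 3


H1 = (7953, 7006)  # sub-ORF H1 as quoted in the text
H2 = (7009, 6215)  # sub-ORF H2 as quoted in the text

if __name__ == "__main__":
    print("overlap (nt):", overlap_length(H1, H2))
    print("frame of H2 relative to H1:", frame_offset_minus_strand(H1[0], H2[0]))
```

With these coordinates the two sub-ORFs share four nucleotides and H2 lies two frames ahead of H1 in the reading direction, which is arithmetically the same as stepping back by one nucleotide. A modified sequence placing both sub-ORFs in a single frame would need to remove that offset at the junction; how this is best done is not specified here.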

The invention thus provides a polynucleotide in substantially 
isolated form which encodes any one of these ORFs or combinations 
thereof. Combinations thereof include combinations of 2, 3, 4, 
5 or all of the ORFs. Polynucleotides may be provided which 
comprise an individual ORF carried in a recombinant vector 
including the vectors described herein. Thus in one preferred 
aspect the invention provides a polynucleotide in substantially 
isolated form capable of selectively hybridizing to the nucleic 
acid comprising ORFs A to F of the core region of the Mptb and 
Mavs pathogenicity islands of the invention. Fragments thereof 
corresponding to ORFs A to E, B to F, A to D, B to E, A to C, B 
to D or any two adjacent ORFs are also included in the invention. 

Polynucleotides of the invention will be capable of selectively 
hybridizing to the corresponding portion of the GS region, or to 
the corresponding ORFs of Mtb described herein. The term 
"selectively hybridizing" indicates that the polynucleotides will 
hybridize, under conditions of medium to high stringency (for 
example 0.03 M sodium chloride and 0.03 M sodium citrate at from 
about 50°C to about 60°C) to the corresponding portion of SEQ ID 
NO: 3 or 4 or the complementary strands thereof but not to genomic 
DNA from mycobacteria which are usually non-pathogenic including 
non-pathogenic species of M. avium. Such polynucleotides will 
generally be at least 68%, e.g. at least 70%, preferably at 
least 80 or 90% and more preferably at least 95%, homologous to 
the corresponding DNA of GS. The corresponding portion will be 
over a region of at least 20, preferably at least 30, for 
instance at least 40, 60 or 100 or more contiguous nucleotides. 

By "corresponding portion" it is meant a sequence from the GS 
region of the same or substantially similar size which has been 
determined, for example by computer alignment, to have the 
greatest degree of homology to the polynucleotide. 

Any combination of the above mentioned degrees of homology and 
minimum sizes may be used to define polynucleotides of the 
invention, with the more stringent combinations (i.e. higher 
homology over longer lengths) being preferred. Thus for example 
a polynucleotide which is at least 80% homologous over 25, 
preferably 30 nucleotides forms one aspect of the invention, as 
does a polynucleotide which is at least 90% homologous over 40 
nucleotides. 
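
As an illustration of how a combination of percentage homology and minimum length might be tested, the sketch below scores ungapped windows of a query against a reference. It is a deliberately simple brute-force comparison written for short sequences, assuming "homology" is read as percentage identity without gaps; in practice an alignment program would normally be used. The function names and the toy sequences are hypothetical and are not taken from GS.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two equal-length strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def best_window_identity(query: str, reference: str, window: int) -> float:
    """Highest ungapped identity of any window-long stretch of the query
    against any equally long stretch of the reference."""
    if window <= 0 or len(query) < window or len(reference) < window:
        return 0.0
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(reference) - window + 1):
            best = max(best, percent_identity(q, reference[j:j + window]))
    return best


def meets_threshold(query, reference, min_identity, window):
    """True if some window reaches the stated identity, e.g. 80% over 25."""
    return best_window_identity(query, reference, window) >= min_identity


if __name__ == "__main__":
    # Toy 40-nucleotide sequences, invented for this example only.
    reference = "ACGTTGCAAGCTTAGGCTAACCGGTTAAGCTAGCTAGGAT"
    query = "ACGTTGCAAGCTTAGGCTAACCGGTTAAGCTAGCTAGGAT".replace("TTAGG", "TTCGG")
    print(meets_threshold(query, reference, 90.0, 40))
    print(meets_threshold(query, reference, 80.0, 25))
```

The two calls correspond to the examples given in the text: at least 90% over 40 nucleotides, and at least 80% over 25 nucleotides.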

A further class of polynucleotides of the invention is the class 
of polynucleotides encoding polypeptides of the invention, the 
polypeptides of the invention being defined in section B below. 
Due to the redundancy of the genetic code as such, 
polynucleotides may be of a lower degree of homology than 
required for selective hybridization to the GS region. However, 
when such polynucleotides encode polypeptides of the invention 
these polynucleotides form a further aspect. It may for example 
be desirable where polypeptides of the invention are produced 
recombinantly to increase the GC content of such polynucleotides. 
This increase in GC content may result in higher levels of 
expression via codon usage more appropriate to the host cell in 
which recombinant expression is taking place. 
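
The effect of the redundancy of the genetic code on GC content can be sketched as follows. This is an illustrative simplification, assuming the standard genetic code and simply choosing, for each codon, the synonymous codon with the most G and C bases; a practical recoding would instead be guided by the codon usage of the chosen host cell. The function names and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def translate(seq: str) -> str:
    """Translate whole codons of a DNA sequence; '*' marks a stop."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def raise_gc(seq: str) -> str:
    """Replace each codon by its most GC-rich synonym.

    Ties keep the original codon where possible, so the encoded
    polypeptide is unchanged by construction.
    """
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        synonyms = sorted(c for c, aa in CODON_TABLE.items()
                          if aa == CODON_TABLE[codon])
        recoded.append(max(
            synonyms,
            key=lambda c: (sum(b in "GC" for b in c), c == codon),
        ))
    return "".join(recoded)


if __name__ == "__main__":
    original = "ATGAAATTTTAA"  # toy sequence, not taken from GS
    recoded = raise_gc(original)
    print(original, gc_content(original), translate(original))
    print(recoded, gc_content(recoded), translate(recoded))
```

The recoded sequence differs from the original at the nucleotide level, and may therefore fall below the homology required for selective hybridization, while still encoding the same polypeptide.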

An additional class of polynucleotides of the invention are those 
obtainable from cosmids MTCY277 and MT024 (containing Mtb genomic 
sequences), which polynucleotides consist essentially of the 
fragment of the cosmid containing an open reading frame encoding 
any one of the homologous ORFs B, C, E or F respectively. Such 
polynucleotides are referred to below as Mtb polynucleotides. 
However, where reference is made to polynucleotides in general 
such reference includes Mtb polynucleotides unless the context 
is explicitly to the contrary. In addition, the invention 
provides polynucleotides which encode the same polypeptide as the 
abovementioned ORFs of Mtb but which, due to the redundancy of 
the genetic code, have different nucleotide sequences. These 
form further Mtb polynucleotides of the invention. Fragments of 
Mtb polynucleotides suitable for use as probes or primers also 
form a further aspect of the invention. 

The invention further provides polynucleotides in substantially 
isolated form capable of selectively hybridizing (where 
selectively hybridizing is as defined above) to the Mtb 
polynucleotides of the invention. 



WO 97/23624 PCT/C B96/0322 « 


The invention further provides the Mtb polynucleotides of the 
invention linked, at the 5' and/or 3' end, to 
polynucleotide sequences to which they are not naturally 
contiguous. Such sequences will typically be sequences found in 
cloning or expression vectors, such as promoters, 5' untranslated 
sequence, 3' untranslated sequence or termination sequences. The 
sequences may also include further coding sequences such as 
signal sequences used in recombinant production of proteins. 

Further polynucleotides of the invention are illustrated in the 
accompanying examples. 

Polynucleotides of the invention may be used to produce a primer, 
eg a PCR primer, a primer for an alternative amplification 
reaction, a probe e.g. labelled with a revealing label by 
conventional means using radioactive or non-radioactive labels 
or a probe linked covalently to a solid phase, or the 
polynucleotides may be cloned into vectors. Such primers, 
probes and other fragments will be at least 15, preferably at 
least 20, for example at least 25, 30 or 40 or more nucleotides 
in length, and are also encompassed by the term polynucleotides 
of the invention as used herein. 

Primers of the invention which are preferred include primers 
directed to any part of the ORFs defined herein. The ORFs from 
other isolates of pathogenic mycobacteria which contain a GS 
region may be determined and conserved regions within each 
individual ORF may be identified. Primers directed to such 
conserved regions form a further preferred aspect of the 
invention. In addition, the primers and other polynucleotides 
of the invention may be used to identify, obtain and isolate ORFs 
capable of selectively hybridizing to the polynucleotides of the 
invention which are present in pathogenic mycobacteria but which 
are not part of a pathogenicity island in that particular species 
of bacteria. Thus in addition to the ORFs B, C, E and F which 
have been identified in Mtb, similar ORFs may be identified in 
other pathogens and ORFs corresponding to the GS ORFs C, D, E, 
F and H, may also be identified. 

Polynucleotides such as DNA polynucleotides and probes according 
to the invention may be produced recombinantly, synthetically, 
or by any means available to those of skill in the art. They may 
also be cloned by standard techniques. 

In general, primers will be produced by synthetic means, 
involving a step-wise manufacture of the desired nucleic acid 
sequence one nucleotide at a time. Automated techniques for 
accomplishing this are readily available in the art. 
Longer polynucleotides will generally be produced using 
recombinant means, for example using PCR (polymerase chain 
reaction) cloning techniques. This will involve making a pair 
of primers (e.g. of about 15-30 nucleotides) to a region of GS 
which it is desired to clone, bringing the primers into contact 
with genomic DNA from a mycobacterium or a vector carrying the 
GS sequence, performing a polymerase chain reaction under 
conditions which bring about amplification of the desired region, 
isolating the amplified fragment (e.g. by purifying the reaction 
mixture on an agarose gel) and recovering the amplified DNA. The 
primers may be designed to contain suitable restriction enzyme 
recognition sites so that the amplified DNA can be cloned into 
a suitable cloning vector. 
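
The primer-design step described above can be sketched in a few lines. The example is illustrative only: it assumes 1-based inclusive coordinates on the positive strand, uses the EcoRI (GAATTC) and BamHI (GGATCC) recognition sites merely as familiar examples of restriction sites, and adds an arbitrary two-base 5' extension; it makes no attempt to check melting temperature, secondary structure, or whether the chosen sites also occur inside the amplified region. The function names and the toy template are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(template, start, end, primer_len=20,
                   fwd_site="GAATTC", rev_site="GGATCC", extension="GC"):
    """Return (forward, reverse) primers flanking template[start..end].

    start and end are 1-based and inclusive on the positive strand.
    Each primer is written 5' to 3' as: extension + restriction site +
    template-specific part.
    """
    if not 1 <= start <= end <= len(template):
        raise ValueError("region lies outside the template")
    region = template[start - 1:end]
    if len(region) < primer_len:
        raise ValueError("region is shorter than one primer")
    forward = extension + fwd_site + region[:primer_len]
    reverse = extension + rev_site + reverse_complement(region[-primer_len:])
    return forward, reverse


if __name__ == "__main__":
    # Toy template invented for this example; not a GS sequence.
    template = "ACGTTGCAAGCTTAGGCTAACCGGTTAAGCTAGCTAGGATCCATG"
    fwd, rev = design_primers(template, 3, 42, primer_len=15)
    print("forward:", fwd)
    print("reverse:", rev)
```

The template-specific part of each primer falls within the range of about 15-30 nucleotides mentioned in the text, and the added sites allow the amplified fragment to be digested and ligated into a correspondingly cut vector.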

Such techniques may be used to obtain all or part of the GS or 
ORF sequences described herein, as well as further genomic clones 
containing full open reading frames. Although in general such 
techniques are well known in the art, reference may be made in 
particular to Sambrook J., Fritsch E.F., Maniatis T. (1989). 
Molecular Cloning: A Laboratory Manual, 2nd edn. Cold Spring 
Harbor, New York, Cold Spring Harbor Laboratory. 



Polynucleotides which are not 100% homologous to the sequences 
of the present invention but fall within the scope of the 
invention can be obtained in a number of ways. 

Other isolates or strains of pathogenic mycobacteria will be 
expected to contain allelic variants of the GS sequences 
described herein, and these may be obtained for example by 
probing genomic DNA libraries made from such isolates or strains 
of bacteria using GS or ORF sequences as probes under conditions 
of medium to high stringency (for example 0.03M sodium chloride 
and 0.03M sodium citrate at from about 50°C to about 60°C). 

A particularly preferred group of pathogenic mycobacteria are 
isolates of M. paratuberculosis. Polynucleotides based on GS 
regions from such bacteria are particularly preferred. Preferred 
fragments of such regions include fragments encoding individual 
open reading frames including the preferred groups and 
combinations of open reading frames discussed above. 

Alternatively, such polynucleotides may be obtained by site 
directed mutagenesis of the GS or ORF sequences or allelic 
variants thereof. This may be useful where for example silent 
codon changes are required to sequences to optimise codon 
preferences for a particular host cell in which the 
polynucleotide sequences are being expressed. Other sequence 
changes may be desired in order to introduce restriction enzyme 
recognition sites, or to alter the property or function of the 
polypeptides encoded by the polynucleotides of the invention. 
Such altered property or function will include the addition of 
amino acid sequences of consensus signal peptides known in the 
art to effect transport and secretion of the modified polypeptide 
of the invention. Another altered property will include 
mutagenesis of a catalytic residue or generation of fusion 
proteins with another polypeptide. Such fusion proteins may be 
with an enzyme, with an antibody or with a cytokine or other 
ligand for a receptor, to target a polypeptide of the invention 
to a specific cell type in vitro or in vivo. 

The invention further provides double stranded polynucleotides 
comprising a polynucleotide of the invention and its complement. 

Polynucleotides or primers of the invention may carry a revealing 
label. Suitable labels include radioisotopes such as 32P or 35S, 
enzyme labels, other protein labels or smaller labels such as 
biotin or fluorophores. Such labels may be added to 
polynucleotides or primers of the invention and may be detected 
using techniques known per se. 

Polynucleotides or primers of the invention or fragments thereof 
labelled or unlabelled may be used by a person skilled in the art 
in nucleic acid-based tests for the presence or absence of Mptb, 
Mavs, other GS-containing pathogenic mycobacteria, or Mtb applied 
to samples of body fluids, tissues, or excreta from animals and 
humans, as well as to food and environmental samples such as 
river or ground water and domestic water supplies. 

Human and animal body fluids include sputum, blood, serum, 
plasma, saliva, milk, urine, CSF, semen, faeces and infected 
discharges. Tissues include intestine, mouth ulcers, skin, lymph 
nodes, spleen, lung and liver obtained surgically or by a biopsy 
technique. Animals particularly include commercial livestock 
such as cattle, sheep, goats, deer and rabbits, but wild animals 
and animals in zoos may also be tested. 

Such tests comprise bringing a human or animal body fluid or 
tissue extract, or an extract of an environmental or food sample, 
into contact with a probe comprising a polynucleotide or primer 
of the invention under hybridising conditions and detecting any 
duplex formed between the probe and nucleic acid in the sample. 
Such detection may be achieved using techniques such as PCR or 
by immobilising the probe on a solid support, removing nucleic 
acid in the sample which is not hybridized to the probe, and then 
detecting nucleic acid which has hybridized to the probe. 
Alternatively, the sample nucleic acid may be immobilized on a 
solid support, and the amount of probe bound to such a support 
can be detected. Suitable assay methods of this and other 
formats can be found in, for example, WO89/03891 and WO90/13567. 

Polynucleotides of the invention or fragments thereof labelled 
or unlabelled may also be used to identify and characterise 
different strains of Mptb, Mavs, other GS-containing pathogenic 
mycobacteria, or Mtb, and properties such as drug resistance or 
susceptibility. 

The probes of the invention may conveniently be packaged in the 
form of a test kit in a suitable container. In such kits the 
probe may be bound to a solid support where the assay format for 
which the kit is designed requires such binding. The kit may 
also contain suitable reagents for treating the sample to be 
probed, hybridising the probe to nucleic acid in the sample, 
control reagents, instructions, and the like. 

The use of polynucleotides of the invention in the diagnosis of 
inflammatory diseases such as Crohn's disease or sarcoidosis in 
humans or Johne's disease in animals forms a preferred aspect of 
the invention. The polynucleotides may also be used in the 
prognosis of these diseases. For example, the response of a 
human or animal subject to antibiotic, vaccination 
or other therapies may be monitored by utilizing the diagnostic 
methods of the invention over the course of a period of treatment 
and following such treatment. 

The use of Mtb polynucleotides (particularly in the form of 
probes and primers) of the invention in the above-described 
methods forms a further aspect of the invention, particularly for 
the detection, diagnosis or prognosis of Mtb infections. 



B. Polypeptides 

Polypeptides of the invention include polypeptides in 
substantially isolated form encoded by GS. This includes the 
full length polypeptides encoded by the positive and 
complementary negative strands of GS. Each of the full length 
polypeptides will contain one of the amino acid sequences set out 
in Seq ID NOs: 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 
29. Polypeptides of the invention further include variants of 
such sequences, including naturally occurring allelic variants 
and synthetic variants which are substantially homologous to said 
polypeptides. In this context, substantial homology is regarded 
as a sequence which has at least 70%, e.g. 80%, 90%, 95% or 98% 
amino acid homology (identity) over 30 or more, e.g. 40, 50 or 100 
amino acids. For example, one group of substantially homologous 
polypeptides are those which have at least 95% amino acid 
identity to a polypeptide of any one of Seq ID NOs: 6, 8, 10, 12, 
14, 16, 18, 20, 22, 24, 26, 28 and 29 over their entire length. 
Even more preferably, this homology is 98%. 

Polypeptides of the invention further include the polypeptide 
sequences of the homologous ORFs of Mtb, namely Seq ID Nos. 31, 
33, 35, 37 and 39. Unless explicitly specified to the contrary, 
reference to polypeptides of the invention and their fragments 
includes these Mtb polypeptides and fragments, and variants 
thereof (substantially homologous to said sequences) as defined 
herein. 

Polypeptides of the invention may be obtained by the standard 
techniques mentioned above. Polypeptides of the invention also 
include fragments of the above mentioned full length polypeptides 
and variants thereof, including fragments of the sequences set 
out in SEQ ID NOs: 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 
29, 31, 33, 35, 37 and 39. Such fragments, for example of 8, 10, 
12, 15 or up to 30 or 40 amino acids, may also be obtained 
synthetically using standard techniques known in the art. 

Preferred fragments include those which include an epitope, 
especially an epitope which is specific to the pathogenicity of 
the mycobacterial cell from which the polypeptide is derived. 
Suitable fragments will be at least about 5, e.g. 8, 10, 12, 15 
or 20 amino acids in size, or larger. Epitopes may be determined 
by techniques such as the peptide scanning techniques 
described by Geysen et al, Mol. Immunol., 22; 709-715 (1986), as 
well as by other techniques known in the art. 
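
As an illustration of how a set of candidate fragments for peptide scanning might be enumerated, the following minimal sketch lists every overlapping peptide of a fixed length from a polypeptide sequence. The function name, the default length of 8 and the step of 1 are assumptions chosen for the example; they are not a description of the procedure of Geysen et al. 

```python
def overlapping_peptides(sequence: str, length: int = 8, step: int = 1) -> list[str]:
    """Return all peptides of `length` residues, offset by `step` residues.

    Sequences shorter than `length` yield an empty list.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [
        sequence[i:i + length]
        for i in range(0, len(sequence) - length + 1, step)
    ]
```

Each peptide in the returned list could then be synthesised and tested for antibody binding; the residues shared by the reactive peptides indicate the position of a linear epitope. 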






The term "an epitope which is specific to the pathogenicity of 
the mycobacterial cell" means that the epitope is encoded by a 
portion of the GS region, or by the corresponding ORF sequences 
of Mtb, which can be used to distinguish mycobacteria which are 
pathogenic from related non-pathogenic mycobacteria including 
non-pathogenic species of M. avium. This may be determined using 
routine methodology. A candidate epitope from an ORF may be 
prepared and used to immunise an animal such as a rat or rabbit 
in order to generate antibodies. The antibodies may then be used 
to detect the presence of the epitope in pathogenic mycobacteria 
and to confirm that non-pathogenic mycobacteria do not contain 
any proteins which react with the epitope. Epitopes may be 
linear or conformational. 

Polypeptides of the invention may be in a substantially isolated 
form. It will be understood that the polypeptide may be mixed 
with carriers or diluents which will not interfere with the 
intended purpose of the polypeptide and still be regarded as 
substantially isolated. A polypeptide of the invention may also 
be in a substantially purified form, in which case it will 
generally comprise the polypeptide in a preparation in which more 
than 90%, e.g. 95%, 98% or 99% of the polypeptide in the 
preparation is a polypeptide of the invention. 

Polypeptides of the invention may be modified to confer a desired 
property or function, for example by the addition of histidine 
residues to assist their purification or by the addition of a 
signal sequence to promote their secretion from a cell. 

Thus, polypeptides of the invention include fusion proteins which 
comprise a polypeptide encoded by all or part of one or more of 
the ORFs of the invention fused at the N- or C-terminus to a second 
sequence to provide the desired property or function. Sequences 
which promote secretion from a cell include, for example, the 
yeast α-factor signal sequence. 

A polypeptide of the invention may be labelled with a revealing 
label. The revealing label may be any suitable label which 
allows the polypeptide to be detected. Suitable labels include 
radioisotopes, e.g. 125I, 35S, enzymes, antibodies, polynucleotides 
and ligands such as biotin. Labelled polypeptides of the 
invention may be used in diagnostic procedures such as 
immunoassays in order to determine the amount of a polypeptide 
of the invention in a sample. Polypeptides or labelled 
polypeptides of the invention may also be used in serological or 
cell mediated immune assays for the detection of immune 
reactivity to said polypeptides in animals and humans using 
standard protocols. 

A polypeptide or labelled polypeptide of the invention or 
fragment thereof may also be fixed to a solid phase, for example 
the surface of an immunoassay well, microparticle, dipstick or 
biosensor. Such labelled and/or immobilized polypeptides may be 
packaged into kits in a suitable container along with suitable 
reagents, controls, instructions and the like. 

Such polypeptides and kits may be used in methods of detection 
of antibodies, or cell mediated immunoreactivity, to the 
mycobacterial proteins and peptides encoded by the ORFs of the 
invention and their allelic variants and fragments, using 
immunoassay. Such host antibodies or cell mediated immune 
reactivity will occur in humans or animals with an immune system 
which detects and reacts against polypeptides of the invention. 
The antibodies may be present in a biological sample from such 
humans or animals, where the biological sample may be a sample 
as defined above, particularly blood, milk or saliva. 

Immunoassay methods are well known in the art and will generally 
comprise: 

(a) providing a polypeptide of the invention comprising an 
epitope bindable by an antibody against said 
mycobacterial polypeptide; 

(b) incubating a biological sample with said polypeptide 
under conditions which allow for the formation of an 
antibody-antigen complex; and 

(c) determining whether antibody-antigen complex 
comprising said polypeptide is formed. 

Immunoassay methods for cell mediated immune reactivity in 
animals and humans are also well known in the art (e.g. as 
described by Weir et al 1994, J. Immunol Methods 176; 93-101) and 
will generally comprise: 

(a) providing a polypeptide of the invention comprising an 
epitope bindable by a lymphocyte or macrophage or 
other cell receptor; 

(b) incubating a cell sample with said polypeptide under 
conditions which allow for a cellular immune response 
such as release of cytokines or other mediator to 
occur; and 

(c) detecting the presence of said cytokine or mediator in 
the incubate. 

Polypeptides of the invention may be made by standard synthetic 
means well known in the art or recombinantly, as described below. 

Polypeptides of the invention or fragments thereof labelled or 
unlabelled may also be used to identify and characterise 
different strains of Mptb, Mavs, other GS-containing pathogenic 
mycobacteria, or Mtb, and properties such as drug resistance or 
susceptibility. 

The polypeptides of the invention may conveniently be packaged 
in the form of a test kit in a suitable container. In such kits 
the polypeptide may be bound to a solid support where the assay 
format for which the kit is designed requires such binding. The 
kit may also contain suitable reagents for treating the sample 
to be examined, control reagents, instructions, and the like. 

The use of polypeptides of the invention in the diagnosis of 
inflammatory diseases such as Crohn's disease or sarcoidosis in 
humans or Johne's disease in animals forms a preferred aspect of 
the invention. The polypeptides may also be used in the 
prognosis of these diseases. For example, the response of a 
human or animal subject to antibiotic or other 
therapies may be monitored by utilizing the diagnostic methods 
of the invention over the course of a period of treatment and 
following such treatment. 

The use of Mtb polypeptides of the invention in the 
above-described methods forms a further aspect of the invention, 
particularly for the detection, diagnosis or prognosis of Mtb 
infections. 

Polypeptides of the invention may also be used in assay methods 
for identifying candidate chemical compounds which will be useful 
in inhibiting, binding to or disrupting the function of said 
polypeptides required for pathogenicity. In general, such assays 
involve bringing the polypeptide into contact with a candidate 
inhibitor compound and observing the ability of the compound to 
disrupt, bind to or interfere with the polypeptide. 

There are a number of ways in which the assay may be formatted. 
For example, those polypeptides which have an enzymatic function 
may be assayed using labelled substrates for the enzyme, and the 
amount of, or rate of, conversion of the substrate into a product 
measured, e.g. by chromatography such as HPLC or by a colourimetric 
assay. Suitable labels include 35S, 125I, biotin or enzymes such 
as horseradish peroxidase. 
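
To illustrate how the rate of substrate conversion might be compared with and without a candidate inhibitor, the following minimal sketch fits a straight line to product measurements taken over time and expresses the effect of the compound as a percentage inhibition. The function names, the linear (initial-rate) assumption and the percentage-inhibition formula are illustrative assumptions; the specification does not prescribe a particular calculation. 

```python
def initial_rate(times: list[float], product: list[float]) -> float:
    """Least-squares slope of product formed against time.

    Assumes the measurements lie in the linear (initial-rate) phase.
    """
    n = len(times)
    if n != len(product) or n < 2:
        raise ValueError("need at least two paired measurements")
    mean_t = sum(times) / n
    mean_p = sum(product) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, product))
    return sxy / sxx


def percent_inhibition(rate_control: float, rate_with_compound: float) -> float:
    """Reduction in rate caused by the compound, as a percentage of control."""
    if rate_control <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (1.0 - rate_with_compound / rate_control)
```

A compound that leaves the rate unchanged scores 0%, while one that abolishes conversion scores 100%. 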

For example, the gene product of ORF C is believed to have 
GDP-mannose dehydratase activity. Thus an assay for inhibitors of the 
gene product may utilise, for example, labelled GDP-mannose, GDP 
or mannose, with the activity of the gene product being followed. ORF 
D encodes a gene related to the synthesis and regulation of 
capsular polysaccharides, which are often associated with 
invasiveness and pathogenicity. Labelled polysaccharide 
substrates may be used in assays of the ORF D gene product. 
ORF F encodes a protein with putative glucosyl 
transferase activity and thus labelled amino sugars such as 
β-1-3-N-acetylglucosamine may be used as substrates in assays. 

Candidate chemical compounds which may be used may be natural or 
synthetic chemical compounds used in drug screening programmes. 
Extracts of plants which contain several characterised or 
uncharacterised components may also be used. 

Alternatively, a polypeptide of the invention may be screened 
against a panel of peptides, nucleic acids or other chemical 
functionalities which are generated by combinatorial chemistry. 
This will allow the definition of chemical entities which bind 
to polypeptides of the invention. Typically, the polypeptide of 
the invention will be brought into contact with a panel of 
compounds from a combinatorial library, with either the panel 
or the polypeptide being immobilized on a solid phase, under 
conditions suitable for the polypeptide to bind to the panel. 
The solid phase will then be washed under conditions in which 
only specific interactions between the polypeptide and individual 
members of the panel are retained, and those specific members may 
be utilized in further assays or used to design further panels 
of candidate compounds. 

For example, a number of assay methods to define peptide 
interaction with peptides are known. For example, WO86/00991 
describes a method for determining mimotopes which comprises 
making panels of catamer preparations, for example octamers of 
amino acids, at which one or more of the positions is defined and 
the remaining positions are randomly made up of other amino 
acids, determining which catamer binds to a protein of interest 
and re-screening the protein of interest against a further panel 
based on the most reactive catamer in which one or more 
additional designated positions are systematically varied. This 
may be repeated throughout a number of cycles and used to build 
up a sequence of a binding candidate compound of interest. 
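
The cyclic refinement described above can be sketched in simplified form. The sketch below is an illustration only and is not the procedure of WO86/00991: it defines one position per cycle, working from left to right, and it replaces the laboratory binding measurement with a hypothetical scoring function supplied by the caller. The names `refine_catamer` and `make_toy_score` are assumptions introduced for the example. 

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def refine_catamer(binding_score, length: int = 8) -> str:
    """Build a binding sequence one defined position per cycle.

    `binding_score` is a hypothetical stand-in for an assay readout.
    It receives a list of `length` items in which defined positions
    hold a residue and undefined (randomised) positions hold None.
    """
    defined = [None] * length
    for pos in range(length):
        # Panel for this cycle: vary the residue at `pos`, keep earlier picks.
        best = max(
            AMINO_ACIDS,
            key=lambda aa: binding_score(
                defined[:pos] + [aa] + defined[pos + 1:]
            ),
        )
        defined[pos] = best
    return "".join(defined)


def make_toy_score(target: str):
    """Toy readout for demonstration: counts positions matching a hidden target."""
    def score(candidate):
        return sum(1 for c, t in zip(candidate, target) if c == t)
    return score
```

With the toy readout, `refine_catamer(make_toy_score("DYKDDDDK"))` recovers the hidden octamer, because each cycle retains the residue that gives the highest score at the position being varied. 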

WO89/03430 describes screening methods which permit the 
preparation of specific mimotopes which mimic the immunological 
activity of a desired analyte. These mimotopes are identified 
by reacting a panel of individual peptides wherein said peptides 
are of systematically varying hydrophobicity, amphipathic 
characteristics and charge patterns, using an antibody against 
an antigen of interest. Thus in the present case antibodies 
against a polypeptide of the invention may be employed and 
mimotope peptides from such panels may be identified. 

C. Vectors. 



Polynucleotides of the invention can be incorporated into a 
recombinant replicable vector. The vector may be used to 
replicate the nucleic acid in a compatible host cell. Thus in 
a further embodiment, the invention provides a method of making 
polynucleotides of the invention by introducing a polynucleotide 
of the invention into a replicable vector, introducing the vector 
into a compatible host cell, and growing the host cell under 
conditions which bring about replication of the vector. The 
vector may be recovered from the host cell. Suitable host cells 
are described below in connection with expression vectors. 



D. Expression Vectors. 

Preferably, a polynucleotide of the invention in a vector is 
operably linked to a control sequence which is capable of 
providing for the expression of the coding sequence by the host 
cell, i.e. the vector is an expression vector. The term "operably 
linked" refers to a juxtaposition wherein the components 
described are in a relationship permitting them to function in 
their intended manner. A control sequence "operably linked" to 
a coding sequence is ligated in such a way that expression of the 
coding sequence is achieved under conditions compatible with the 
control sequences. Such vectors may be transformed into a 
suitable host cell as described above to provide for expression 
of a polypeptide of the invention. Thus, in a further aspect the 
invention provides a process for preparing polypeptides according 
to the invention which comprises cultivating a host cell 
transformed or transfected with an expression vector as described 
above, under conditions to provide for expression by the vector 
of a coding sequence encoding the polypeptides, and recovering 
the expressed polypeptides. 

A further embodiment of the invention provides vectors for the 
replication and expression of polynucleotides of the invention, 
or fragments thereof. The vectors may be, for example, plasmid, 
virus or phage vectors provided with an origin of replication, 
optionally a promoter for the expression of the said 
polynucleotide and optionally a regulator of the promoter. The 
vectors may contain one or more selectable marker genes, for 
example an ampicillin resistance gene in the case of a bacterial 
plasmid or a neomycin resistance gene for a mammalian vector. 
Vectors may be used in vitro, for example for the production of 
RNA, or used to transfect or transform a host cell. The vector 
may also be adapted to be used in vivo, for example in a method 
of naked DNA vaccination or gene therapy. A further embodiment 
of the invention provides host cells transformed or transfected 
with the vectors for the replication and expression of 
polynucleotides of the invention, including the DNA of GS, the 
open reading frames thereof and other corresponding ORFs, 
particularly ORFs B, C, E and F from Mtb. The cells will be 
chosen to be compatible with the said vector and may for example 
be bacterial, yeast, insect or mammalian. 

Expression vectors are widely available in the art and can be 
obtained commercially. Mammalian expression vectors may comprise 
a mammalian or viral promoter. Mammalian promoters include the 
metallothionein promoter. Viral promoters include promoters from 
adenovirus, the SV40 large T promoter and retroviral LTR 
promoters. Promoters compatible with insect cells include the 
polyhedrin promoter. Yeast promoters include the alcohol 
dehydrogenase promoter. Bacterial promoters include the 
β-galactosidase promoter. 

The expression vectors may also comprise enhancers and, in the 
case of eukaryotic vectors, a polyadenylation signal sequence 
downstream of the coding sequence being expressed. 

Polypeptides of the invention may be expressed in suitable host 
cells, for example bacterial, yeast, plant, insect and mammalian 
cells, and recovered using standard purification techniques 
including, for example, affinity chromatography, HPLC or other 
chromatographic separation techniques. 

Polynucleotides according to the invention may also be inserted 
into the vectors described above in an antisense orientation in 
order to provide for the production of antisense RNA. Antisense 
RNA or other antisense polynucleotides or ligands may also be 
produced by synthetic means. Such antisense polynucleotides may 
be used in a method of controlling the levels of the proteins 
encoded by the ORFs of the invention in a mycobacterial cell. 
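
As a small illustration of the relationship between a coding sequence and its antisense RNA, the following sketch computes the reverse complement of a coding-strand DNA sequence and writes it as RNA. The function name is an assumption for the example, and only the four unambiguous bases are handled. 

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_rna(coding_dna: str) -> str:
    """Antisense RNA (5'->3') for a coding-strand DNA sequence (5'->3').

    The antisense transcript is the reverse complement of the coding
    strand, with uracil (U) in place of thymine (T).
    """
    coding_dna = coding_dna.upper()
    try:
        reverse_complement = "".join(
            _COMPLEMENT[base] for base in reversed(coding_dna)
        )
    except KeyError as exc:
        raise ValueError(f"unexpected base: {exc.args[0]!r}") from None
    return reverse_complement.replace("T", "U")
```

For the coding sequence ATGGCC the function returns GGCCAU, which pairs base for base with the corresponding sense transcript AUGGCC. 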

Polynucleotides of the invention may also be carried by vectors 
suitable for gene therapy methods. Such gene therapy methods 
include those designed to provide vaccination against diseases 
caused by pathogenic mycobacteria or to boost the immune response 
of a human or animal infected with a pathogenic mycobacterium. 

For example, Ziegner et al, AIDS, 1995, 9; 43-50 describes the use 
of a replication defective recombinant amphotropic retrovirus to 
boost the immune response in patients with HIV infection. Such 
a retrovirus may be modified to carry a polynucleotide encoding 
a polypeptide or fragment thereof of the invention and the 
retrovirus delivered to the cells of a human or animal subject 
in order to provide an immune response against said polypeptide. 
The retrovirus may be delivered directly to the patient or may 
be used to infect cells ex vivo, e.g. fibroblast cells, which 
are then introduced into the patient, optionally after being 
inactivated. The cells are desirably autologous or HLA-matched 
cells from the human or animal subject. 

Gene therapy methods, including methods for boosting an immune 
response to a particular pathogen, are disclosed generally in, for 
example, WO95/14091, the disclosure of which is incorporated herein 
by reference. Recombinant viral vectors include retroviral 
vectors, adenoviral vectors, adeno-associated viral vectors, 
vaccinia virus vectors, herpes virus vectors and alphavirus 
vectors. Alphavirus vectors are described in, for example, 
WO95/07994, the disclosure of which is incorporated herein by 
reference. 

Where direct administration of the recombinant viral vector is 
contemplated, either in the form of naked nucleic acid or in the 
form of packaged particles carrying the nucleic acid, this may be 
done by any suitable means, for example oral administration or 
intravenous injection. From 10^5 to 10* c.f.u. of virus represents 
a typical dose, which may be repeated for example weekly over a 
period of a few months. Administration of autologous or 
HLA-matched cells infected with the virus may be more convenient in 
some cases. This will generally be achieved by administering 
doses, for example from 10^5 to 10* cells per dose, which may be 
repeated as described above. 

The recombinant viral vector may further comprise nucleic acid 
capable of expressing an accessory molecule of the immune system 
designed to increase the immune response. Such a molecule may 
be, for example, an interferon, particularly interferon gamma, an 
interleukin, for example IL-1α, IL-10 or IL-2, or an HLA class 
I or II molecule. This may be particularly desirable where the 
vector is intended for use in the treatment of humans or animals 
already infected with a mycobacterium and it is desired to boost 
the immune response. 



WO 97/23624 PCT/GB96/03221 



E. Antibodies. 



The invention also provides monoclonal or polyclonal antibodies 
to polypeptides of the invention or fragments thereof. The 
invention further provides a process for the production of 
monoclonal or polyclonal antibodies to polypeptides of the 
invention. Monoclonal antibodies may be prepared by conventional 
hybridoma technology using the polypeptides of the invention or 
peptide fragments thereof as immunogens. Polyclonal antibodies 
may also be prepared by conventional means which comprise 
inoculating a host animal, for example a rat or a rabbit, with 
a polypeptide of the invention or peptide fragment thereof and 
recovering immune serum. 

In order that such antibodies may be made, the invention also 
provides polypeptides of the invention or fragments thereof 
haptenised to another polypeptide for use as immunogens in 
animals or humans. 

For the purposes of this invention, the term "antibody", unless 
specified to the contrary, includes fragments of whole antibodies 
which retain their binding activity for a polypeptide of the 
invention. Such fragments include Fv, F(ab') and F(ab')2 
fragments, as well as single chain antibodies. Furthermore the 
antibodies and fragments thereof may be humanised antibodies, 
e.g. as described in EP-A-239400. 

Antibodies may be used in methods of detecting polypeptides of 
the invention present in biological samples (where such samples 
include the human or animal body samples, and environmental 
samples, mentioned above) by a method which comprises: 

(a) providing an antibody of the invention; 

(b) incubating a biological sample with said antibody 
under conditions which allow for the formation of an 
antibody-antigen complex; and 

(c) determining whether antibody-antigen complex 
comprising said antibody is formed. 

Antibodies of the invention may be bound to a solid support, for 
example an immunoassay well, microparticle, dipstick or biosensor, 
and/or packaged into kits in a suitable container along with 
suitable reagents, controls, instructions and the like. 

Antibodies of the invention may be used in the detection, 
diagnosis and prognosis of diseases as described above in 
relation to polypeptides of the invention. 



Compositions 



The present invention also provides compositions comprising a 
polynucleotide or polypeptide of the invention together with a 
carrier or diluent. Compositions of the invention also include 
compositions comprising a nucleic acid, particularly an 
expression vector, of the invention. Compositions further 
include those carrying a recombinant virus of the invention. 
Such compositions include pharmaceutical compositions, in which 
case the carrier or diluent will be pharmaceutically acceptable. 

Pharmaceutically acceptable carriers or diluents include those 
used in formulations suitable for inhalation as well as oral or 
parenteral (e.g. intramuscular or intravenous or transcutaneous) 
administration. The formulations may conveniently be presented 
in unit dosage form and may be prepared by any of the methods 
well known in the art of pharmacy. Such methods include the step 
of bringing into association the active ingredient with the 
carrier which constitutes one or more accessory ingredients. In 
general the formulations are prepared by uniformly and intimately 
bringing into association the active ingredient with liquid 
carriers or finely divided solid carriers or both, and then, if 
necessary, shaping the product. 

For example, formulations suitable for parenteral administration 
include aqueous and non-aqueous sterile injection solutions which 
may contain anti-oxidants, buffers, bacteriostats and solutes 
which render the formulation isotonic with the blood of the 
intended recipient, and aqueous and non-aqueous sterile 
suspensions which may include suspending agents and thickening 
agents, and liposomes or other microparticulate systems which are 
designed to target the polynucleotide or the polypeptide of the 
invention to blood components or one or more organs, or to target 
cells such as M cells of the intestine after oral administration. 



Vaccines. 



In another aspect, the invention provides novel vaccines for the 
prevention and treatment of infections caused by Mptb, Mavs, 
other GS-containing pathogenic mycobacteria and Mtb in animals 
and humans. The term "vaccine" as used herein means an agent 
used to stimulate the immune system of a vertebrate, particularly 
a warm blooded vertebrate including humans, so as to provide 
protection against future harm by an organism to which the 
vaccine is directed or to assist in the eradication of an 
organism in the treatment of established infection. The immune 
system will be stimulated by the production of cellular immunity 
and antibodies, desirably neutralizing antibodies, directed to 
epitopes found on or in a pathogenic mycobacterium which 
expresses any one of the ORFs of the invention. The antibody so 
produced may be any of the immunological classes, such as the 
immunoglobulins A, D, E, G or M. Vaccines which stimulate the 
production of IgA are of interest since this is the principal 
immunoglobulin produced by the secretory system of warm-blooded 
animals, and the production of such antibodies will help prevent 
infection or colonization of the intestinal tract. However an 
IgM and IgG response will also be desirable for systemic 
infections such as Crohn's disease or tuberculosis. 

Vaccines of the invention include polynucleotides of the 
invention or fragments thereof in suitable vectors and 
administered by injection of naked DNA using standard protocols. 
Polynucleotides of the invention or fragments thereof in suitable 
vectors for the expression of the polypeptides of the invention 
may be given by injection, inhalation or by mouth. Suitable 
vectors include M. bovis BCG, M. smegmatis or other mycobacteria, 
Corynebacteria, Salmonella or other agents according to 
established protocols. 

Polypeptides of the invention or fragments thereof in 
substantially isolated form may be used as vaccines by injection, 
inhalation, oral administration or by transcutaneous application 
according to standard protocols. Adjuvants (such as Iscoms or 
polylactide-coglycolide encapsulation), cytokines such as IL-12 
and other immunomodulators may be used for the selective 
enhancement of the cell mediated or humoral immunological 
responses. Vaccination with polynucleotides and/or polypeptides 
of the invention may be undertaken to increase the susceptibility 
of pathogenic mycobacteria to antimicrobial agents in vivo. 

In instances wherein the polypeptide is correctly configured so 
as to provide the correct epitope, but is too small to be 
immunogenic, the polypeptide may be linked to a suitable carrier. 

A number of techniques for obtaining such linkage are known in 
the art, including the formation of disulfide linkages using 
N-succinimidyl-3-(2-pyridylthio)propionate (SPDP) and succinimidyl 
4-(N-maleimido-methyl)cyclohexane-1-carboxylate (SMCC) obtained 
from Pierce Company, Rockford, Illinois (if the peptide lacks 
a sulfhydryl group, this can be provided by addition of a 
cysteine residue). These reagents create a disulfide linkage 
between themselves and peptide cysteine residues on one protein 
and an amide linkage through the epsilon-amino on a lysine, or 
other free amino group in the other. A variety of such 
disulfide/amide-forming agents are known. See, for example, 
Immun Rev (1982) 62:185. Other bifunctional coupling agents form 
a thioether rather than a disulfide linkage. Many of these 
thioether-forming agents are commercially available and include 
reactive esters of 6-maleimidocaproic acid, 2-bromoacetic acid, 
2-iodoacetic acid, 4-(N-maleimido-methyl)cyclohexane-1-carboxylic 
acid, and the like. The carboxyl groups can be activated by 
combining them with succinimide or 1-hydroxyl-2-nitro-4-sulfonic 
acid, sodium salt. Additional methods of coupling antigens 
employ the rotavirus/"binding peptide" system described in EPO 
Pub. No. 259,149, the disclosure of which is incorporated herein 
by reference. The foregoing list is not meant to be exhaustive, 
and modifications of the named compounds can clearly be used. 

Any carrier may be used which does not itself induce the 
production of antibodies harmful to the host. Suitable carriers 
are typically large, slowly metabolized macromolecules such as 
proteins; polysaccharides, such as latex functionalized 
Sepharose®, agarose, cellulose, cellulose beads and the like; 
polymeric amino acids, such as polyglutamic acid, polylysine, 
polylactide-coglycolide and the like; amino acid copolymers; and 
inactive virus particles. Especially useful protein substrates 
are serum albumins, keyhole limpet hemocyanin, immunoglobulin 
molecules, thyroglobulin, ovalbumin, tetanus toxoid, and other 
proteins well known to those skilled in the art. 

The immunogenicity of the epitopes may also be enhanced by 
preparing them in mammalian or yeast systems fused with or 
assembled with particle-forming proteins such as, for example, 
that associated with hepatitis B surface antigen. See, e.g., 
US-A-4,722,840. Constructs wherein the epitope is linked directly 
to the particle-forming protein coding sequences produce hybrids 
which are immunogenic with respect to the epitope. In addition, 
all of the vectors prepared include epitopes specific to HBV, 
having various degrees of immunogenicity, such as, for example, 
the pre-S peptide. 

In addition, portions of the particle-forming protein coding 
sequence may be replaced with codons encoding an epitope of the 
invention. In this replacement, regions which are not required 
to mediate the aggregation of the units to form immunogenic 
particles in yeast or mammals can be deleted, thus eliminating 
additional HBV antigenic sites from competition with the epitope 
of the invention. 

Vaccines may be prepared from one or more immunogenic 
polypeptides of the invention. These polypeptides may be 
expressed in various host cells (e.g., bacteria, yeast, insect, 
or mammalian cells), or alternatively may be isolated from viral 
preparations or made synthetically. 

In addition to the above, it is also possible to prepare live 
vaccines of attenuated microorganisms which express one or more 
recombinant polypeptides of the invention. Suitable attenuated 
microorganisms are known in the art and include, for example, 
viruses (e.g., vaccinia virus), as well as bacteria. 

The preparation of vaccines which contain an immunogenic 
polypeptide(s) as active ingredients is known to one skilled in 
the art. Typically, such vaccines are prepared as injectables, 
or as suitably encapsulated oral preparations, and either liquid 
solutions or suspensions; solid forms suitable for solution in, 
or suspension in, liquid prior to ingestion or injection may also 
be prepared. The preparation may also be emulsified, or the 
protein encapsulated in liposomes. The active immunogenic 
ingredients are often mixed with excipients which are 
pharmaceutically acceptable and compatible with the active 
ingredient. Suitable excipients are, for example, water, saline, 
dextrose, glycerol, ethanol, or the like and combinations 
thereof. In addition, if desired, the vaccine may contain minor 
amounts of auxiliary substances such as wetting or emulsifying 
agents, pH buffering agents, and/or adjuvants which enhance the 
effectiveness of the vaccine. Examples of adjuvants which may 
be effective include but are not limited to: aluminum hydroxide, 
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), 
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred 
to as nor-MDP), 
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine 
(CGP 19835A, referred to as MTP-PE), and RIBI, which contains 
three components extracted from bacteria, monophosphoryl lipid 
A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in 
a 2% squalene/Tween® 80 emulsion. The effectiveness of an 
adjuvant may be determined by measuring the amount of antibodies 
directed against an immunogenic polypeptide containing an 
antigenic sequence resulting from administration of this 
polypeptide in vaccines which are also comprised of the various 
adjuvants. 

The vaccines are conventionally administered parenterally, by 
injection, for example, either subcutaneously or intramuscularly. 
Additional formulations which are suitable for other modes of 
administration include suppositories, oral formulations or 
enemas. For suppositories, traditional binders and carriers may 
include, for example, polyalkylene glycols or triglycerides; such 
suppositories may be formed from mixtures containing the active 
ingredient in the range of 0.5% to 10%, preferably 1% - 2%. Oral 
formulations include such normally employed excipients as, for 
example, pharmaceutical grades of mannitol, lactose, starch, 
magnesium stearate, sodium saccharine, cellulose, magnesium 
carbonate, and the like. These compositions take the form of 
solutions, suspensions, tablets, pills, capsules, sustained 
release formulations or powders and contain 10% - 95% of active 
ingredient, preferably 25% - 70%. 

The proteins may be formulated into the vaccine as neutral or
salt forms. Pharmaceutically acceptable salts include the acid
addition salts (formed with free amino groups of the peptide) and
which are formed with inorganic acids such as, for example,
hydrochloric or phosphoric acids, or with organic acids such as
acetic, oxalic, tartaric, maleic, and the like. Salts formed
with the free carboxyl groups may also be derived from inorganic
bases such as, for example, sodium, potassium, ammonium, calcium,
or ferric hydroxides, and such organic bases as isopropylamine,
trimethylamine, 2-ethylamino ethanol, histidine, procaine, and
the like.

The vaccines are administered in a manner compatible with the
dosage formulation, and in such amount as will be
prophylactically and/or therapeutically effective. The quantity
to be administered, which is generally in the range of 5 µg [...]
of antigen per dose, depends on the subject to be treated,
capacity of the subject's immune system to synthesize antibodies,
mode of administration and the degree of protection desired.
Precise amounts of active ingredient required to be administered
may depend on the judgement of the practitioner and may be
peculiar to each subject.

The vaccine may be given in a single dose schedule, or preferably
in a multiple dose schedule. A [...] required to maintain and/or
reinforce the immune response, for example, at 1-4 months for a
second dose, and if needed, a subsequent dose(s) after several
months. The dosage regimen will also, at least in part, be
determined by the need of the individual and be dependent upon
the judgement of the practitioner.

In a further aspect of the invention, there is provided an
attenuated vaccine comprising a normally pathogenic mycobacterium
which harbours an attenuating mutation in any one of the genes
encoding a polypeptide of the invention. The gene is selected
from the group of ORFs A, B, C, D, E, F, G and H, including the
homologous ORFs B, C, E and F in Mtb.

The mycobacteria may be used in the form of killed bacteria or 
as a live attenuated vaccine. There are advantages to a live 
attenuated vaccine. The whole live organism is used, rather than 
dead cells or selected cell components which may exhibit modified 
or denatured antigens. Protein antigens in the outer membrane 
will maintain their tertiary and quaternary structures. 
Therefore the potential to elicit a good protective long term 
immunity should be higher. 

The term "mutation" and the like refers to a genetic lesion in 
a gene which renders the gene non-functional. This may be at 
either the level of transcription or translation. The term thus 
envisages deletion of the entire gene or substantial portions 
thereof, and also point mutations in the coding sequence which 
result in truncated gene products unable to carry out the normal 
function of the gene. 

A mutation introduced into a bacterium of the invention will
generally be a non-reverting attenuating mutation. Non-reverting
means that for practical purposes the probability of the mutated
gene being restored to its normal function is small, for example
less than 1 in 10^6 such as less than 1 in 10^9 or even less than
1 in 10^12.

An attenuated mycobacterium of the invention may be in isolated
form. This is usually desirable when the bacterium is to be used
for the purposes of vaccination. The term "isolated" means that
the bacterium is in a form in which it can be cultured, processed
or otherwise used in a form in which it can be readily identified
and in which it is substantially uncontaminated by other
bacterial strains, for example non-attenuated parent strains or
unrelated bacterial strains. The term "isolated bacterium" thus
encompasses cultures of a bacterial mutant of the invention, for
example in the form of colonies on a solid medium or in the form
of a liquid culture, as well as frozen or dried preparations of
the strains.

In a preferred aspect, the attenuated mycobacterium further
comprises at least one additional mutation. This may be a
mutation in a gene responsible for the production of products
essential to bacterial growth which are absent in a human or
animal host. For example, mutations to the gene for aspartate
semi-aldehyde dehydrogenase (asd) have been proposed for the
production of attenuated strains of Salmonella. The asd gene is
described further in Gene (1993) 129; 123-128. A lesion in the
asd gene, encoding the enzyme aspartate β-semialdehyde
dehydrogenase, would render the organism auxotrophic for the
essential nutrient diaminopimelic acid (DAP), which can be provided
exogenously during bulk culture of the vaccine strain. Since
this compound is an essential constituent of the cell wall for
gram-negative and some gram-positive organisms and is absent from
mammalian or other vertebrate tissues, mutants would undergo
lysis after about three rounds of division in such tissues.
Analogous mutations may be made to the attenuated mycobacteria
of the invention.

In addition or in the alternative, the attenuated mycobacteria
may carry a recA mutation. The recA mutation knocks out
homologous recombination - the process which is exploited for the
construction of the mutations. Once the recA mutation has been
incorporated the strain will be unable to repair the constructed
deletion mutations. Such a mutation will provide attenuated
strains in which the possibility of homologous recombination
with DNA from wild-type strains has been minimized. RecA genes
have been widely studied in the art and their sequences are
available. Further modifications may be made for additional
safety.

The invention further provides a process for preparing a vaccine
composition comprising an attenuated bacterium according to the
invention, which process comprises (a) inoculating a culture
vessel containing a nutrient medium suitable for growth of said
bacterium; (b) culturing said bacterium; (c) recovering said
bacteria and (d) mixing said bacteria with a pharmaceutically
acceptable diluent or carrier.

Attenuated bacterial strains according to the invention may be
constructed using recombinant DNA methodology which is known per
se. In general, bacterial genes may be mutated by a process of
targeted homologous recombination in which a DNA construct
containing a mutated form of the gene is introduced into a host
bacterium which it is desired to attenuate. The construct will
recombine with the wild-type gene carried by the host and thus
the mutated gene may be incorporated into the host genome to
provide a bacterium of the present invention which may then be
isolated.

The mutated gene may be obtained by introducing deletions into
the gene, e.g. by digesting with a restriction enzyme which cuts
the coding sequence twice to excise a portion of the gene and
then religating under conditions in which the excised portion is
not reintroduced into the cut gene. Alternatively frame shift
mutations may be introduced by cutting with a restriction enzyme
which leaves overhanging 5' and 3' termini, filling in and/or
trimming back the overhangs, and religating. Similar mutations
may be made by site directed mutagenesis. These are only
examples of the types of techniques which will readily be at the
disposal of those of skill in the art.
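
The two in vitro routes just described (excising the fragment between two restriction sites, and filling in an overhang to shift the reading frame) can be sketched in a few lines of Python. This is an illustrative sketch only: the choice of EcoRI (site GAATTC, cut after the first base), the toy gene sequence and all function names are assumptions made for the example and are not taken from the specification.

```python
SITE = "GAATTC"  # EcoRI recognition site (enzyme assumed for illustration)
CUT = 1          # EcoRI cuts after the first base: G^AATTC


def delete_between_sites(seq: str, site: str = SITE, cut: int = CUT) -> str:
    """Excise the fragment between the first two sites and religate the ends."""
    first = seq.find(site)
    second = seq.find(site, first + 1) if first != -1 else -1
    if first == -1 or second == -1:
        raise ValueError("sequence must contain the site at least twice")
    # Compatible sticky ends rejoin, leaving a single intact site.
    return seq[: first + cut] + seq[second + cut:]


def fill_in_and_religate(seq: str, site: str = SITE, cut: int = CUT) -> str:
    """Cut once, fill in the 5' overhangs and blunt-end religate."""
    pos = seq.find(site)
    if pos == -1:
        raise ValueError("site not found")
    overhang = site[cut: len(site) - cut]  # "AATT" for EcoRI
    # Filling in both ends duplicates the overhang in the religated product.
    return seq[: pos + cut] + overhang + seq[pos + cut:]


def shifts_reading_frame(original: str, mutant: str) -> bool:
    """True when the length change is not a whole number of codons."""
    return (len(mutant) - len(original)) % 3 != 0


# Hypothetical toy gene carrying two EcoRI sites.
toy_gene = "ATGGCC" + SITE + "AAACCCGGG" + SITE + "GGCTAA"
deleted = delete_between_sites(toy_gene)
frameshifted = fill_in_and_religate(deleted)
```

In this toy case the deletion removes 15 bases and so stays in frame, whereas the fill-in adds 4 bases and shifts the frame, which is why the second route is described as a frame shift mutation.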

Various assays are available to detect successful recombination.
In the case of attenuations which mutate a target gene necessary
for the production of an essential metabolite or catabolite
compound, selection may be carried out by screening for bacteria
unable to grow in the absence of such a compound. Bacteria may
also be screened with antibodies or nucleic acids of the
invention to determine the absence of production of a mutated
gene product of the invention or to confirm that the genetic
lesion introduced - e.g. a deletion - has been incorporated into
the genome of the attenuated strain.

The concentration of the attenuated strain in the vaccine will
be formulated to allow convenient unit dosage forms to be
prepared. Concentrations of from about 10^[?] to 10^9 bacteria
per ml will generally be suitable, e.g. from about 10^[?] to 10^8
such as about 10^[?] per ml. Live attenuated organisms may be
administered subcutaneously or intramuscularly at up to 10^3
organisms in one or more doses, e.g. from around 10^5 to 10^8,
e.g. about 10^5 or 10^7 organisms in a single dose.

The vaccines of the invention may be administered to recipients 
to treat established disease or in order to protect them against 
diseases caused by the corresponding wild type mycobacteria, such 
as inflammatory diseases such as Crohn's disease or sarcoidosis 
in humans or Johne's disease in animals. The vaccine may be 
administered by any suitable route. In general, subcutaneous or 
intramuscular injection is most convenient, but oral, intranasal 
and colorectal administration may also be used. 

The following Examples illustrate aspects of the invention.

Example 1:

Tests for the presence of the GS identifier sequence were
performed on 5 µl bacterial DNA extracts (25 µg/ml to 500 ng/ml)
using polymerase chain reaction based on the oligonucleotide
primers 5'-GATGCCGTGAGGAGGTAAAGCTGC-3' (Seq ID No. 40) and
5'-GATACGGCTCTTGAATCCTGCACG-3' (Seq ID No. 41) from within the
identifier DNA sequences (Seq. ID Nos 1 and 2). PCR was performed
for 40 cycles in the presence of 1.5 mM magnesium and an
annealing temperature of 58°C. The presence or absence of the
correct amplification product indicated the presence or absence
of GS identifier sequence in the corresponding bacterium. GS
identifier sequence is shown to be present in all the laboratory
and field strains of Mptb and Mavs tested. This includes Mptb
isolates 0025 (bovine, CVL Weybridge), 0021 (caprine, Moredun),
0022 (bovine, Moredun), 0139 (human, Chiodini 1984), 0209,
0208, 0211, 0210, 0212, 0207, 0204, 0206 (bovine, Whipple 1990).
All Mptb strains were IS900 positive. The Mavs strains include
0010 and 0012 (woodpigeon, Thorel), 0018 (armadillo, Portaels) and
0034, 0037, 0038, 0040 (AIDS, Hoffner). All Mavs strains were
IS902 positive. One pathogenic M. avium strain 0033 (AIDS,
Hoffner) also contained GS identifier sequence. GS identifier
sequence is absent from other mycobacteria including other
M.avium, M.malmoense, M.szulgai, M.gordonae, M.chelonei,
M.fortuitum, M.phlei, as well as E.coli, S.aureus, Nocardia sp.,
Streptococcus sp., Shigella sp. and Pseudomonas sp.
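
As a hedged illustration of how the primer pair defines the amplification product, the following Python sketch locates the primers of Seq ID Nos 40 and 41 on a template and returns the predicted amplicon. The template used here is a made-up toy sequence and the function names are illustrative assumptions. Exact string matching stands in for primer annealing, so mismatches that a real PCR might tolerate are not modelled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def predicted_amplicon(template: str, fwd: str, rev: str):
    """Return the product bounded by the two primers, or None if either is absent."""
    template = template.upper()
    start = template.find(fwd.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the top
    # strand its binding site reads as the primer's reverse complement.
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return template[start: end + len(rev_site)]


FWD = "GATGCCGTGAGGAGGTAAAGCTGC"  # Seq ID No. 40
REV = "GATACGGCTCTTGAATCCTGCACG"  # Seq ID No. 41

# Hypothetical template: primer sites separated by a short filler.
toy_template = "AAAA" + FWD + "ACGTACGTAC" + reverse_complement(REV) + "TTTT"
product = predicted_amplicon(toy_template, FWD, REV)
negative = predicted_amplicon("ACGT" * 30, FWD, REV)
```

A template lacking either primer site returns `None`, mirroring the absence of an amplification product in organisms that do not carry the GS identifier sequence.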

Example 2:

To obtain the full sequence of GS in Mavs and Mptb we generated
a genomic library of Mavs using the restriction endonuclease
EcoRI and cloning into the vector pUC18. This achieved a
representative library which was screened with 32P-labelled
identifier sequence yielding a positive clone containing a 17kbp
insert. We constructed a restriction map of this insert and
identified GS as fragments unique to Mavs and Mptb and not
occurring in laboratory strains of M. avium. These fragments
were sub-cloned into pUC18 and pGEM4Z. We identified GS
contained within an 8kb region. The full nucleotide sequence
was determined for GS on both DNA strands using primer walking
and automated DNA sequencing. DNA sequence for GS in Mptb was
obtained using overlapping PCR products generated using Pwo DNA
polymerase, a proofreading thermostable enzyme. The final DNA
sequences were derived using the University of Wisconsin GCG gel
assembly software package.

Example 3:

The DNA sequence of GS in Mavs and Mptb was found to be more
than 99% homologous. The ORFs encoded in GS were identified
using GeneRunner and DNAStar computer programmes. Eight ORFs
were identified and designated GSA, GSB, GSC, GSD, GSE, GSF, GSG
and GSH. Database comparisons were carried out against the
GenEMBL Database release version 48.0 (9/96), using the BLAST and
BLIXEM programmes. GSA and GSB encoded proteins of 13.5kDa and
30.7kDa respectively, both of unknown functions. GSC encoded
a protein of 38.4kDa with a 55% homology to the amino acid
sequence of rfbD of V.cholerae, a 62% amino acid sequence
homology to gmd of E.coli and a 58% homology to gca of
Ps.aeruginosa which are all GDP-D-mannose dehydratases.
Equivalent gene products in H.influenzae, S.dysenteriae,
Y.enterocolitica, N.gonorrhoeae, K.pneumoniae and rfbD in
Salmonella enterica are all involved in 'O'-antigen processing
known to be linked to pathogenicity. GSD encoded a protein of
37.1kDa which showed 58% homology at the DNA level to wcaG from
E.coli, a gene involved in the synthesis and regulation of
capsular polysaccharides, also related to pathogenicity. GSE
was found to have a > 30% amino acid homology to rfbT of
V.cholerae, involved in the transport of specific LPS components
across the cell membrane. In V.cholerae the gene product causes
a seroconversion from the Inaba to the Ogawa 'epidemic' strain.

GSF encoded a protein of 30.2kDa which was homologous in the
range 25-40% at the amino acid level to several glucosyl
transferases such as rfpA of K.pneumoniae, rfbB of K.pneumoniae,
lgtD of H.influenzae, lsi of N.gonorrhoeae. In E.coli an
equivalent gene galE adds β-1-3 N-acetylglucosamine to galactose,
the latter only found in 'O' and 'M' antigens which are also
related to pathogenicity. GSH, comprising the ORFs GSH1 and GSH2,
encodes a protein totalling about 60kDa which is a putative
transposase with a 40 - 43% homology at the amino acid level to
the equivalent gene product of IS21 in E.coli. This family of
insertion sequences is broadly distributed amongst gram negative
bacteria and is responsible for mobility and transposition of
genetic elements. An IS21-like element in B.fragilis is split
either side of the β-lactamase gene controlling its activation
and expression. We programmed an E.coli S30 cell-free extract
with plasmid DNA containing the ORF GSH under the control of a
lac promoter in the presence of 35S-methionine, and
demonstrated the translation of an abundant 60kDa protein.
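
The kind of analysis described in this Example (scanning both strands for open reading frames and expressing sequence similarity as a percentage) can be illustrated with a minimal Python sketch. It is not the algorithm used by GeneRunner, DNAStar, BLAST or BLIXEM. The start codon set (ATG plus GTG), the minimum ORF length, the toy sequence and the function names are all assumptions made for the example, and the identity function expects sequences that have already been aligned to equal length.

```python
STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "GTG"}  # assumption: GTG counted as an alternative start
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_orfs(seq: str, min_codons: int = 30):
    """Return (strand, frame, start, end) for ORFs in all six reading frames.

    Coordinates are 0-based on the scanned strand; end includes the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in STARTS:
                    start = i                      # first start codon opens the ORF
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None                   # look for the next ORF
    return orfs


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Hypothetical toy sequence: one 41-codon ORF in frame 2 of the top strand.
toy = "CC" + "ATG" + "GCT" * 40 + "TAA" + "GG"
toy_orfs = find_orfs(toy)
```

Real ORF prediction tools add codon usage statistics and ribosome binding site models, and database searches such as BLAST compute similarity from local alignments rather than from a position-by-position comparison.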

The proteins homologous to GS encoded in other organisms are in
general highly antigenic. Thus the proteins encoded by the ORFs
in GS may be used in immunoassays of antibody or cell mediated
immuno-reactivity for diagnosing infections caused by
mycobacteria, particularly Mptb, Mavs and Mtb. Enhancement of
host immune recognition of GS encoded proteins by vaccination
using naked specific DNA or recombinant GS proteins may be used
in the prevention and treatment of infections caused by Mptb,
Mavs and Mtb in humans and animals. Mutation or deletion of all
or some of the ORFs A to H in GS may be used to generate
attenuated strains of Mptb, Mavs or Mtb with lower pathogenicity
for use as living or killed vaccines in humans and animals. Such
vaccines are particularly relevant to Johne's disease in animals,
to diseases caused by Mptb in humans such as Crohn's disease, and
to the management of tuberculosis especially where the disease
is caused by multiple drug-resistant organisms.

SEQUENCE LISTING

Seq. ID No. 1

5'- 1 GATCCAACTA AACCCGATGG AACCCCGCGC AAACTATTGG ACGTCTCCGC GCTACGCAGT 
61 TGGGTTGGCG CCCGCGAATC GCACTGAAAG AGGGCATCGA TGCAACGGTG TCGTGGTACC 
121 GCACAAATGC CGATGCCGTG AGGAGGTAAA GCTGCGGGCC GGCCGATGTT ATCCCTCCGG 
181 CCGGACGGGT AGGGCGACCT GCCATCGAGT GGTACGGCAG TCGCCTGGCC GGCGAGGCGC 
241 ATGGCCTATG TGAGTATCCC ATAGCCTGGC TTGGCTCGCC CCTACGCATT ATCAGTTGAC 
301 CGCTTTCGCG CCACGTCGCA GGCTTGCGGC AGCATCCCGT TCAGGTCTCC TCATGGTCCG 
361 GTGTGGCACG ACCACGCAAG CTCGAACCGA CTCGTTTCCC AATTTCGCAT GCTAATATCG 
421 CTCGATGGAT TTTTTGCGCA ACGCCGGCTT GATGGCTCGT AACGTTAGCA CCGAGATGCT 
481 GCGCCACTCC GAACGAAAGC GCCTATTAGT AAACCAAGTC GAAGCATACG GAGTCAACGT 
541 TGTTATTGAT GTCGGTGCTA ACTCCGGCCA GTTCGGTAGC GCTTTGCGTC GTGCAGGATT 
601 CAAGAGCCGT ATCGTTTCCT TTGAACCTCT TTCGGGGCCA TTTGCGCAAC TAACGCGCAA 
661 GTCGGCATCG GATC -3'

Seq. ID No. 2

5'- 1 GATCCGATGC CGACTTGCGC GTTAGTTGCG CAAATGGCCC CGAAAGAGGT TCAAAGGAAA
61 CGATACGGCT CTTGAATCCT GCACGACGCA AAGCGCTACC GAACTGGCCG GAGTTAGCAC
121 CGACATCAAT AACAACGTTG ACTCCGTATG CTTCGACTTG GTTTACTAAT AGGCGCTTTC
181 GTTCGGAGTG GCGCAGCATC TCGGTGCTAA CGTTACGAGC CATCAAGCCG GCGTTGCGCA
241 AAAAATCCAT CGAGCGATAT TAGCATGCGA AATTGGGAAA CGAGTCGGTT CGAGCTTGCG
301 TGGTCGTGCC ACACCGGACC ATGAGGAGAC CTGAACGGGA TGCTGCCGCA AGCCTGCGAC
361 GTGGCGCGAA AGCGGTCAAC TGATAATGCG TAGGGGCGAG CCAAGCCAGG CTATGGGATA
421 CTCACATAGG CCATGCGCCT CGCCGGCCAG GCGACTGCCG TACCACTCGA TGGCAGGTCG
481 CCCTACCCGT CCGGCCGGAG GGATAACATC GGCCGGCCCG CAGCTTTACC TCCTCACGGC
541 ATCGGCATTT GTGCGGTACC ACGACACCGT TGCATCGATG CCCTCTTTCA GTGCGATTCG
601 CGGGCGCCAA CCCAACTGCG TAGCGCGGAG ACGTCCAATA GTTTGCGCGG GGTTCCATCG
661 GGTTTAGTTG GATC -3'

Seq. ID No. 3

1 • GAATTCTGGG 77GGAGACGA CGTCGAACTC C7GG7CGGTC 77GC77CGAA 

51 7GA7CGC7G7 GA7C7GG7CG GCGGTGCCGA CAGGAACCGT CGACTTGTCG 

101 ACGA7CACC7 TGTACCGGTC GATGTATGAC CCAATGTCGT CCGCAACCGA 

5 151 GAAGACGTAC GTCAGGTCCG CCGCCCCGCT TTCACCCATG GGCGTCGGGA 

201 CGGCGATGAA AA7GACG7CC GCGTGCTCGA TTCCGCGTTG CCGGTCGGTG 

2 SI GTGAAGTCAA 7CAGCCCGTT CTCACGGTTC C7CGCAATCA ACTCCCAACC 

301 CGGGCTCGAA AATCGGGACA CTGCCTGCGA GGAGCAAA7C GATCTTGGCC 

351 TGATCGATAT CGACACAGAC GACATCGTTG CCGCTATCCG CGAGACAGGC 

10 401 GCCCGTGACG AGGCCTACAT AGCCTGATCC GACCACCGAA ATTTTCAAGA 

451 TGACCCCTTC AAGTCCCCGA TCGGTCGACG ACCATACTGC CGCAACTCTG 

501 TACCCTCCGT GGGTAATTCG CATGTCGCGT TCGTAAGGAG CAGCCAGCGA 

551 GTCGGGGACG TTCGG7GAGA GAGTCGCAGG ACTACGAGGT TGCCGGTGCG 

601 ATACATCACA G7G7TGCG7C TGTCGGCAAC GATGCAGCAA GAACCCACGG 

15 651 GGCAGCCCTG AACTGCGCGC ATGACCGGTC CTTGTCCTGG CACCTTTGAT 

701 CGGCCACCGC TTCCATGCGA ACATGACCGG AATCCATAGC GCGTGGTCAA 

751 GCAGCGGGGA GG7AGACG7C GGTGTCATCT GCTCCAACCG TGTCGGTGAT 

801 AACGATTTCG CTGAACGATC TCGAGGGAT7 GAAAAGCACC GTGGAGAGCG 

851 7TCGCGCGCA GCGCTATGGG GGGCGAATCG AGCACATCGT CATCGACGGT 

20 901 GGATCGGGCG ACGCCGTCGT GGAGTATCTG TCCGGCGATC CTGGCTTTGC 

9S1 ATATTGGCAA TCTCAGCCCG ACAACGGGAG ATATGACGCG ATGAATCAGG 

1001 GCATTGCCCA TTCGTCGGGC GACC7GTTG7 GGTTTATGCA CTCCACGGAT 

1051 CGTTTCTCCG ATCCAGATGC AGTCGCTTCC G7GG7GGAGG CGC7C7CGGG 

1101 GCA7GGACCA G7ACG7GA77 7G7GGGG77A CGGGAAAAAC AACC7TG7CG 

25 HSl GAC7CGACGG CAAACCAC77 77CCC7CGGC CG7ACGGC7A 7A7GCCG777 

1201 AAGA7GCGGA AA777C7GCT CGGCGCGACG G77GCGCA7C AGGCG ACA77 

1251 C77CGGCGCG 7CGC7GG7AG CCAAG77GGG CGG7TACGA7 C77GA777TG 

1301 GAC7CGAGGC GGACCAGC7G 7TCA7C7ACC G7GCCGCAC7 AA7ACGGCC7 

1351 CCCG7CACGA 7CGACCGCG7 G GT T TG CGAC 77CGA7G7CA CGGGACC7GG 

30 1401 77CAACCCAG CCCA7CCG7G AGCAC7A7CG GACCC7GCGG CGGC7C7GGG 

14 51 ACCTGCA7GG CGAC7ACCCG C7GGG7GGGC GCAGAG7GTC G7GGGC77AC 

1501 77GCG7G7GA AGGAG7AC77 GA77CGGGCC GACC7GGCCG CATTCAACGC 

1551 GG7AAAG7TC 7TGCGAGCGA AG77CGCCAG AGC77CGCGG AAGCAAAA77 

1601 CATAGAAACC AACT7C7ACT GCCTGACC7G AGCAGCGCCG AGGCGCGCAG 

35 1651 CGCGATCAG7 GCGACC7GAA CGGCCAGG7G GAAAGCGCCA CCGA7CCCGG 

1701 CACCGAGTGC C7GACGCT7C GGA7CCC77G CACCACAACG AGAG7GAGAG 

1751 CGCCATGA7G AGGAAATA7C GGCTGGGCGG AG7CAACGCC GGAG7GACAA 

1801 AAGTGAGAAC CCGGTGAAGC GAGCGC77A7 AACAGGGA7C ACGGGGCAGG 

1851 A7GGTTCCTA CC7CGCCGAG CTAC7AC7GA GCAAGGGA7A CGAGG7TCAC 

40 1901 GGGCTCGTTC G7CGAGC7TC GACG777AAC ACG7CGCGGA 7CGA7CACC7 

1951 C7ACGTTGAC CCACACCAAC CGGGCGCGCG C7TG7TC77G CAC7A7GCAG 

2001 ACCTCAC7GA CGGCACCCGG TTGGTGACCC 7GC7CAGCAG 7A7CGACCCG 

20 SI GATGAGG7C7 ACAACCTCGC AGCGCAG7CC CA7G7GCGCG 7CAGCT7TGA 

2101 CGAGCCAGTG CA7ACCGGAG ACACCACCGG CA7GGGA7CG A7CCGAC77C 

45 2151 7GGAAGCAG7 CCGCC7TTCT CGGG7GGAC7 GCCGG77C7A 7CAGGC77CC 

2201 7CG7CGGAGA 7G77CGGCGC A7C7CCGCCA CCGCAGAACG AA7CGACGCC 

2251 G7TCTATCCC CG77CGCCA7 ACGGCGCGGC CAAGG7CT7C 7CG7AC7GGA 

2301 CGACTCGCAA C7A7CGAGAG GCG7ACGGA7 7A77CGCAG7 GAA7GGCA7C 

2351 7TGT7CAACC A7GAG7CCCC CCGGCGCGGC GAGACT77CG 7GACCCGAAA 

50 2401 GA7CACGCG7 GCCG7GGCGC GCATCCGAGC 7GGCG7CCAA 7CGGAGG7C7 

24 51 A7A7GGGCAA CC7CGA7GCG A7CCGCGAC7 GGGGC7ACGC GCCCGAA7A7 

2501 GTCGAGGGGA TGTGGAGGAT GTTGCAAGCG CCTGAACCTG ATGACTACGT
3201

3251 

3301
3451

3501
4551
4651
4801

4951 

4901 

4951 

5001 

5051
CCTGGCGACA
TTGACCATGT 
TATTTGCGTC 
GGCCCAGTCA 
GCATCA7GGT 
TGGATCGACA 
ACACCTGGGC 
GGGGCTGGTC 
CCAATCTCAT 
GCAACGTTTG 
GGCCGCACGG 
TCTTGTCCGA 
GCCGTGCGTG 
GAAGTACGCT 
TGGAGCCCAC 
CAAGTTCAGG 
GCCGACTAAC 
ATCTCTTGCC 
GCAGAAGAGG 
GCATGTCGAC 
ATGGTCCGAA 
GAGATCGCAG 
TTGGGATCCA 
CCGCGCTACG 
ATCGATGCAA 
GTAAAGCTGC 
GACCTGCCGT 
CTATGGGAGT 
TTGACCGCTT 
TCTCC7CATG 
TTCCCAATTT 
GGCTTGATGG 
AAAGCGCCTA 
TTGATGTCGG 
GGATTCAAGA 
GCAACTAACG 
ATGCCCTAGG 
GCGGGGGCAA 
CTTTCCTCCC 
TTGATTCGGT 
AAGATCGACG 
AACGCTTAAC 
CGTTGTACGA 
TCCCTAGGTT 
CAATGGTCGA 
GACATAAATG 
TGAGCCGGCC 
GACGTGCGGC 
CTGCGCCAGT 
CTGCAAGCCT 
AGTGGTCCTT 
ACAG7T7CCG 
GATGATGGCC 



GGGCGTGGTT 
CGGGCTCGAC 
CCACCGAGGT 
CTCGGCTGGA 
GGACGCGGAC 
CGCCGATGTT 
CTCTGGACCG 
GGCTCAGCGC 
7G7GCGA7CA 
ATTTTG7GTC 
GTCGGCGGCA 
AAACCTCCGA 
7GCCGCGGC7 
CCGCAACCTA 
CAACGACGCG 
CGGTTAGGCG 
CTCTACGGAC 
GGCGCTCA7C 
TGACGAATTG 
GATCTGGCGA 
CCACGTCAAC 
ACATGG7CGC 
ACTAAACCCG 
CGAGTTGGGT 
CGGTGTCGTG 
GGGTCGGCCG 
CGAGTGGTAC 
ATCCAATAGC 
TCGCGCCAGC 
G7CCGG7G7G 
CGCA7GC7AA 
CTCGTAACGT 
TTAGTAAACC 
7GC7AAC7CC 
GCCG7A7CG7 
CGCAAGTCGG 
CGACGCCGAT 
GTAGTTCCGT 
GCGAATTATA 
TGCATCAGAA 
TACAGGGTTT 
GAAAGC7GCG 
AGGTGACATG 
TCAGACTGAC 
ATGCTTCAAG 



CTCCGTCGGC 
7CCCGGGCAC 
ACGAACAGGT 
GTTCTCGATA 
GCCTCGGAAG 
GTCGACGGCG 
CCCGGAAC7C 
CCTACGACGC 



ACACCGTACG 
TGGCAAAAGC 
CG ATT CG CT A 
AAGCTTCGGT 
ATCGCCGCGT 
GCC7GGTTGG 
CGCAACGCCC 
TCGTACGTAG 
CGCGATGAGA 
TGAGACAAGA 
TCATGGCGAA 
ATCCAGACCA 
CC7TTTCCTC 
TCCACGAGAG 
TATGCGATCG 
CCAATATGGG 
CCGGCGACAA 
CGTCGATATG 
GGGGACCGG7 
GCGCA7GCC7 
G7GGGCACCG 
TACAGCGGTG 
ATGGAACCCC 
TGGCGCCCGC 
GTACCGCACA 
ATGT7ATCCC 
GGCAGTCGCC 
CTGGC7TGGC 
7CGCAGGCTT 
GCACGACCAC 
TATCGCTCGA 
TAGTACCGAG 
AATTCAAAGC 
GGCCAGTTCG 
TTCCTT7GAA 
CATCGGATCC 
GAGACGATTA 
GCTGCCGATG 
TTGGCACCGA 
TTTCTGAACC 
CGAGAAGCAG 
TCGGCATGCA 
CTGATTCATG 
GGGTTTGiTG 
CTGACGGCAT 
ACCCTGCCGG 
C7AATCGAC7 
GGCCGGCTGC 
ATTATCCC7A 
CATCGTCGGG 
GTTCGACCGA 
GGC7CGCGAC 
CATGAACCGC 



7GAGTTCGC7 
GCGTCAAG7T 
GTAGGAGA7G 
7CA7AC7GG7 
7GGAG7GCGA 
GGCAGAG7AA 
G7G7A7A7CG 
A777GAGGCC 
77GA7C7GAC 
CCACAGGTGA 
7AACACC7A7 
A777GC7CGA 
GG77CG7CA7 
7GC77TA77G 
CCAAGA7CGC 
CTGGCGTGGA 

cttctccccg 

AGGAAGCCAA 
AC7CCGCGGC 
G7TCCTT77G 
GCG7CGA7CA 
GGC7ACA7CG 
GCGCAAAC7A 
GAA7CGCAC7 
AA7GCCGA7G 
7CCGGCCGGA 
TGGCCGGCGA 
7CGCCCC7AC 
GCGGCAGCA7 
GCAAGC7CGA 
7GGA777TT7 
A7GC7GCGCC 
A7ACGGAG7C 
G7AGCGC77T 
CCTCTT7CGG 
ACTATGGGAG 
CCATCAA7G7 
CTTAAAAG7C 
AGACG77GCA 
C7ACCGA7G7 
G7TATCACGG 
ACTCGAACTT 
AAGCGC77GA 
CCCGGC7T7A 

7A7CCAAACG 
A7C7AAA77G 
7AGCG7TACA 
CC77CAA7GC 
CAGACC7ACC 
7CGGACCC7C 
7GG7CG77CA 
GGCG7C 



CAAGC7GC77 
7GACGACCGC 
CCGACAAGGC 
GAAC7CGCGC 
7GGCACACCA 
GTTGACGACT 
CCGG7CA7CG 
GAGGGGT7CA 
GGACCGAGCC 
7CA7CGA7GC 
CCCGCGGAC7 
CGCAGC7G7C 
GCA7C7ACCC 
ACTGGCCC77 
CGG7A7CCTG 
7CTC7GCGA7 
7CCGGG7CGC 
AGC7GG7GG7 
GCGAAC7TC7 
GAACA777CG 
CAGCA77AGC 
GCGAAACACG 
77GGACGTC7 
GAAAGACGGC 
CCG7GAGGAG 
CGGG7GGGGC 
GGCGCGTGGC 
GCA77A7CAG 
CCCG77CAGG 
ACCGACTCG7 
GCGCAACGCC 
ACT7CGAACG 
AACG77G77A 
GCG7CG7GCA 
GGCCA7T7GC 
7GTCACCAG7 
GGCAGGCAA7 
A7CAAGA7GC 
A7ACACCGCC 
7ACT77CC7G 
GCAG7AAG7C 
7CTT77AT7C 
ACTTG7C7AT 
CGGA7CCGCG 
GGGGACGA77 
GGCGA7C7GG 
AGGCGGCCGC 
CACG7CA7GA 
AGCGG7GACG 
GGGAAG7GGA 
GACATCGCGA 
CAGCGGGCCC 
TGGCCACAGG
5201 CGAATGGGTA CTTTTTTTAG GCGCCGACGA CACCCTCTAC GAACCAACCA

5251 CGTTGGCCCA GGTAGCCGCT 77TCTCGGCG ACCA7GCGGC AAGCCATCTT 

5301 GTCTATGGCG ATGTTGTGAT GCGTTCGACG AAAAGCCGGC ATGCCGGACC 

53 51 TTTCGACCTC GACCGCCTCC TA7TTGAGAC GAATTTGTGC CACCAATCGA 
5 S4 01 TCTTTTACCG CCG7GAGC77 TTCGACGGCA TCGGCCCTTA CAACC7GCGC 

54 51 TACCGAGTCT GGGCGGACTG GGACTTCAAT ATTCGCTGCT TCTCCAACCC 
5501 GGCGCTGATT ACCCGCTACA TGGACGTCGT GATTTCCGAA TACAACGACA 
5551 TGACCGGC7T CAGCATGAGG CAGGGGAC7G ATAAAGAGTT CAGAAAACGG 
5601 CTGCCAATGT A C T T C TG GGT TGCAGGGTGG GAGACTTGCA GGCGCATGCT 

10 5651 GGC GTTTTTG AAAGACAAGG AGAATCGCCG 7CTGGCCTTG CGTACGCGGT 

S701 TGATAAGGGT TAAGGCCGTC TCCAAAGAAC GAAGCGCAGA ACCGTAGTCG 

5751 CGGATCCACA TTGGACTTCT TTAACGCGTT TGCGTCCTGA TCCACCTTTC 

5801 AAGCCCGTTC CGCGTAACGC GGCGCGCAGA GAGTGGTCGC ATATCGCATC 

5851 ACTGTTCTCG TGCCAGTGCT TGGAAAGCGT CGAGCACTCT GGTTCGCGTT 

15 5901 CTTGACGTTC GCGCCCGCTC CTAGAGGTAG CGTG7CACGT GACTGAAGCC 

5951 AATGAGTGCA ACTCGGCGTC GCGAAAGGTT TCAGTCGCGG TTGAGCAAGA 

6001 CACCGCAAGA CTACTGGAGT GCGTGCACAA GCGCCTCCAG CTCGCGGCTG 

6051 AAAGCGGATG CAAAGGGATT CGAAGCTTGA GCAACATGCG AAGGGGAGAA 

6101 CGGCCTATGA GGCTGGGACA GGTTTTCGAT CCGCGCGCGA ATGCACTGTC 

20 6151 AATGGCCAAG TAGAAGTCCC CGCTGGTGGC CAGCAGAAGT CCCCACTCCG 

6201 CTGCGGGTGG TTGGCTAATT C77GGCGGCT CCCTTCTTGT GGTCGGCGTG 

6251 GCGCATCCGG TAGGACTCGC CGGAGGTGAC GACGATGCTG GCGTGGTGCA 

6301 GCAGCCGA7C GAGGA7GC7G GCGGCGGTGG TG7GCTCGGG CAGGAATCGC 

63 51 CCCCATTGTT CGAAGGGCCA ATGCGAGGCG ATGGCCAGGG AGCGGCGCTC 

25 6401 GTAGCCGGCA GCCACGAGCC GGAACAACAG TTGAGTCCCG GTGTCGTCGA 

6451 GCGGGGCGAA GCCGATCTCG TCCAAGATGA CCAGATCCGC GCGGAGCAGG 

6501 GTGTCGATGA TCTTGCCGAC GGTGTTGTCG GCCAGGCCGC GGTAGAGGAC 

6551 CTCGATCAGG TCGGCGGCGG TGAAGTAGCG GAC7TTGAA7 CCGGCGTGGA 

6601 CGGCAGCGTG CCCGCAGCCG ATGAGCAGGT GACTTTTGCC CGTACCAGGT 

30 6651 GGGCCAATGA CCGCCAGGTT CTGTTGTGCC CGAATCCATT CCAGGCTCGA 

6701 CAGGTAGTCG AACGTGGCTG CGGTGATCGA CGATCCGGTG ACGTCGAACC 

6751 CG7CGAGGG7 CTTGGTGACC GGGAAGGCTG CGGCCTTGAG ACGGTTGGCG 

6801 GTGTTGGAGG CATCGCGGGC AGCGATCTCG GCCTCAACCA ACGTCCGCAG 

6851 GATCTCCTCC GG7GTCCAGC G7TGCGTCTT GGCGACTTGC AACACCTCGG 

35 6901 CGGCGTTGCG GCGCACCGTG GCCAGCTTCA ACCGCCGCAG CGCCGCG7CA 

6951 AGGTCAGCAG CCAGCGGTGC CGCCGAGGAC GGTGCCACCG GCTTGGCAGC 

7001 GGTGGTCATG AGGCCGTCCC GTCGGTGGTG TTGATC7TGT AGGCCTCCAA 

7051 CGAGCGGGTC TCGACGGTGG GCAGATCGAG CACGAGTGCG TCGCCGGCGG 

7101 GGCGGGGTTG TGGGGTGCCG GCGCCGGCGG CCAGGATCGA GCGCACGTCG 

40 7151 GCAGCGCGGA ACCGGCGAAA CGCAACCGCC CGGCGCAGCG CGTCAATCAA 

7201 AGCCTGTTCG CCGTGGGCGG CGCCAAGGCC GAGCAGAATG TCGAGTTCGG 

7251 ATTTCAGTCG GGTGTTGCCG A7CGCAGCAG CACCGACGAG GAAC7GC7GC 

7301 GCTTCGGTTC CCAATGCGCA GAA7CG7T7C 7CTGC7TGGG 7T7TCGGGCG 

7351 AGGACCACGC GAGGG7GCGG G7C7GGG7CC G7CG7AGTG7 7CA7CGAGGA 

45 7401 7GGACACC7C ACCTGGGCTG ACGAGC7CG7 GCTCGGCCAC GA7CACACCG 

7451 G7CGCAGGT7 CCAACAGGA7 CAGGGCGCCA 7GA7CGACCA CCACCGCCAC 

7501 GGTGGCACCG ACGAGCCGCt GAGGCACCGA G7AACGAGC7 GAGCCGTAAC 

7551 GGA7GCACGA GAGGCCG7CG ACCT7ACGGC GCACCGACCC CGAGCCGA7C 

7601 G7CGGCCGCA GCGAGGGCAG CTCCCTCAAG ACGGTGCGC? CG7CAACCAA 

50 7651 GCGA7CGTTG GGCACGGCGC AGA7C7CCGA G7GGACCG7G GCA77GACC7 

7701 CGGCGCACCA 7AGTTGCGCC 7GGGCG77GA GGGCACG7AG GTCGACCTGC 

7751 7CACCGGC7A ACGCAGC77C GG7CAGCAGC GGCACCGCAA GG7CG7CCTG 

7801 AGCGTAGCCA CAGAGGTTCT CCACGATGCC CTTC3ATTGC GGATCCGCAC

7851 CGTGGCAGAA GTCCGGAACG AAGCCATAGT GGGACCCGAA TCGCACATAA

7901 TCCGGTGTTG GAACAACAAC ATTGGCGACG ACACCACCTT TGAGGCAGCC

7951 CATCCGGTCG GCGAGGATCT TGGCCGGAAC CCCACCGATC GCCTC



Seq. ID No. 4 



1 TTCTACTGCC TGACCTGAGC AGCGCCGAGG CGCGCAGCGC GATCACTGCG ACCTGAATGG 
61 CCAGGTGGAA AGCGCCACCG ATCCCGGCAC CGAGTGCCTG ACGAT7CGGA TCCCTTGCAC 
121 CACAACGAGA G7GAGACCGC CATGATGACG AAA7ATCGGC TGGGCGGAGT CAACGCCGGA 
181 GTGACAAAAG TGAGAACCCG G7GAAGCGAG CGCTTATAAC AGGGATCACG GGGCAGGA7G 
241 GTTCCTACCT CGCCGAGCTA CTACTGAGCA AGGGATACGA GGTTCACGGG CTCGTTCGTC 
301 GAGCTTCGAC GTT7AACACG TCGCGGATCG ATCACCTCTA CGTTGACCCA CACCAACCGG 
361 GCGCGCGCTT GTTC7TGCAC 7A7GCAGACC TCACTGACGG CACCCGGTTG GTGACCCTGC 
42i TCAGCAGTAT CGACCCGGAT GAGGTCTACA ACCTCGCAGC GCAGTCCCAT GTGCGCGTCA 
481 GCTTTGACGA GCCAGTGCAT ACCGGAGACA CCACCGGCAT GGGATCGATC CGAC77C7GG 
541 AAGCAGTCCG CCTTTCTCGG GTGGACTGCC GGTTCTATCA GGCTTCCTCG 7CGGAGA7G7 
601 TCGGCGCA7C 7CCGCCACCG CAGAACGAA7 CGACGCCG77 CTA7CCCCG7 7CGCCA7ACG 
661 GCGCGGCCAA GG7C77C7CG 7AC7GGACGA C7CGCAAC7A 7CGAGAGGCG 7ACGGA77A7 
721 TCGCAGTGAA TGGCA7CTTG TTCAACCA7G AG7CCCCCCG GCGCGGCGAG ACTT7CG7GA 
781 C-GAAAGA7 CACGCG7GCC GTGGCGCGCA 7CCGAGC7GG CG7CCAA7CG GAGG7CTA7A 
841 TGGGCAACCT CGA7GCGATC CGCGACTGGG GCTACGCGCC CGAA7ATG7C GAGGGGA7G7 
01 GGAGGATGTT GCAAGCGCC7 GAACC7GA7G AC7ACG7CC7 GGCGACAGGG CGTGGT7ACA 
1 cIgtIcSgA GTTCGC7CAA GC7GC77T7G ACCACG7CGG GC7CGAC7GG CAAAAGCACG 
12 TCaI^TGA CGACCGC7A7 TTGCGCCCCA CCGAGG7C3A 77CGC7AG7A GGAGA7GCCG 

8 IcA^GC CCAG7CAC7C GGC7GGAAAG CITCGG77CA 7AC7GG7GAA CTCGCGCGCA 
1 TCA7GG7GGA CGCGGACA7C GCCGCG7CGG AG7GCGA7GG CACACCA7GG A7CGACACGC 

01 CGA7G773CC 7GG77GGGGC GGAG7AAGT7 GACGAC7ACA CCTGGGCCTC 7GGACCGCGC 
1261 SSgC=G7G 7A7A7CGCCG G7CA7CGGGG GC7GG7CGGC 7CAGCGCTCG TACG7AGA7T 
132 TGAGGCCGAG GGG7TCACCA A7C7CA77G7 GCGATCACGC GA7GAGA77G A7C7GACGGA 

CCG^GCCGGA ACGT77GA7T 77G7G7C7GA GACAAGACCA CAGG7GA7CA 7CGA7GCGGC 
CGcTc^C GGCGGCA7CA 7GGCGAA7AA CACC7A7CCC GCGGAC77C7 TG7CCGAAAA 
1501 CCTCCGAATC CAGACCAA7T 7GC7CGACGC AGCTG7CGCC G7GCG7G7GC CGCGGC7C.. 
ScC7CGG7 7CG7CA7GCA 7CTACCCGAA GTACGCTCCG CAACC7A7CC ACGAGAG7GC 
StATTGACT GGCCCrrrGG AGCCCACCAA CGACGCG7AT GCGA7CGCCA AGA7CGCCGG 
168 tItcSgS GTTCAGGCGG TTAGGCGCCA A7ATGGGC7G GCG7GGA7CT CTGCGA7GCC 
7 SSScSc 7ACGGACCCG GCGACAACTT CTCCCCG7CC GGGTCGCATC TCTTGCC^GC 
lloi rCTCATCCG7 CGA7A7GAGG AAGCCAAAGC TGG7GG7GCA GAAG^GTGA OGAAT7GG.G 
1861 GACCGG7AC" CCGCGGCGCG AACT7C7GCA TG7CGACGA7 CTGGCGAGC- CA7GCv..GTT 
C^SSaI CATTTCGATG G7CCGAACCA CG7CAACG7G GGCACCGGCG 7CGA7CACAG 
1 8 CA^SaG ATCGCAGACA 7GG7CGC7AC GGCGG7GGGC 7ACA7CGGCG AAACACG7TG 
SS^AACr AAACCCGA7G GAACCCCGCG CAAAC7A77G GACG7C7CCG CGCTACGC* 
2101 G7TGGGTTGG CGCGCGCGAA 7CGCACTGAA AGACGGCA7C GA7GCAACGG 7G7CG7GGTA 
SSSS GCCGATGCCG 7GAGGAGG7A AAGC7GCGGG CCGGCCGA7G ^CCCTCC 
2221 GGCCGGACGG G7AGGGCGAC C7GCCA7CGA G7GG7ACGGC AG7CGCC7GG CCGGCGAGww 
2 "l S^^A 7GGGAG7A7C CCA7AGCC7G GC77GGC7CG CCCC7ACGCA 77A7CAG7TG 
3 AC^CG CGCCAGCTCG CAGGCTCGCG GCAGCA7CCC GTTCAGG7C7 CCTCATGGTC 
240 CGG7G7GGCA CGACCACGCA AGCTCGAACC GAC7CG777C CCAATTTCSC A7GC7AA7AT 
24 CG^CGA^GG A77T7TTGCG CAACGCCGGC T7GATGGC7C G7AACG77AG CACCGAGATG 
252* C^CCAC- 7CGAACGAAA GCGCC7A77A G7AAACCAA7 7CAAAGCA7A CGGAG7CAAC 
"l GT^GTTATTG A7G7CGG7GC TAACTCCGGC CAGT7CGG7A GCGCTTTGCG 7CG7GCAGGA 

2 1 SS^AGCC G7A7CG7T7C C7T7GAACC7 C777CGGGGC CA7T7GCGCA AC7AACGCGC 

2701 AAGTCGGCAT CGGATCCACT ATGGGAGTGT CACCAGTATG CCCTAGGCGA CGCCGATGAG
2761 ACGATTACCA TCAATGTGGC AGGCAATGCG GCGGCAAGTA GTTCCGTGCT GCCGATGCTT
2821 AAAAGTCATC AAGATGCCTT TCCTCCCGCG AATTATATTG GCACCGAAGA CGTTGCAATA 
2881 CACCGCCTTG ATTCGGTTGC ATCAGAATTT CTGAACCCTA CCGATGTTAC TTTCCTGAAG 
2941 ATCGACGTAC AGGGT77CGA GAAGCAGGTT ATCGCGGGCA GTAAGTCAAC GCTTAACGAA 
3001 AGCTGCGTCG GCATGCAACT CGAACTTTCT TTTATTCCGT TGTACGAAGG TGACATGCTG 
3061 ATTCATGAAG CGCTTGAACT TGTCTATTCC CTAGGTTTCA GACTGACGGG TTTGTTGCCC 
3121 GGATTTACGG ATCCGCGCAA TGGTCGAATG CTTCAAGCTG ACGGCATTTT CTTCCGTGGG 
3181 GACGATTGAC ATAAATGCTT GCGTCGGCAC CCTGCCGGTA TCCAAACGGG CGATCTGGTG 
3241 AGCCGGCCTC CCGGGCACCT AATCGACTAT CTAAATTGAG GCGGCCGCGA CGTGCGGCAC 
3301 GAACAGGTGG CCGGCTGCTA GCGTTACACA CGTCATGACT GCGCCAGTGT TCTCGATAAT 
3361 TATCCCTACC TTCAATGCAG CGGTGACGCT GCAAGCCTGC CTCGGAAGCA TCGTCGGGCA 
3421 GACCTACCGG GAAGTGGAAG TGGTCCTTGT CGACGGCGGT TCGACCGATC GGACCCTCGA 
3481 CATCGCGAAC AGTTTCCGCC CGGAACTCGG CTCGCGACTG GTCGTTCACA GCGGGCCCGA 
3541 TGATGGCCCC TACGACGCCA TGAACCGCGG CGTCGGCGTA GCCACAGGCG AATGGGTACT 
3601 TTTTTTAGGC GCCGACGACA CCCTCTACGA ACCAACCACG TTGGCCCAGG TAGCCGCTTT 
3661 TCTCGGCGAC CATGCGGCAA GCCATCTTGT CTATGGCGAT GTTGTGATGC GTTCGACGAA 
3721 AAGCCGGCAT GCCGGACCTT TCGACCTCGA CCGCCTCCTA TTTGAGACGA ATTTGTGCCA 
3781 CCAATCGATC TTTTACCGCC GTGAGCTTTT CGACGGCATC GGCCCTTACA ACCTGCGCTA 
3841 CCGAGTCTGG GCGGACTGGG ACTTCAATAT TCGCTGCTTC TCCAACGCGG CGCTGATTAC 
3901 CCGCTACATG GACGTCGTGA TTTCCGAATA CAACGACATG ACCGGCTTCA GCATGAGGCA 
3961 GGGGACTGAT AAAGAGTTCA GAAAACGGCT GCCAATGTAC TTCTGGGTTG CAGGGTGGGA 
4021 GACTTGCAGG CGCATGCTGG CGTTTTTGAA AGACAAGGAG AATCGCCGTC TGGCCTTGCG 
4081 TACGCGGTTG ATAAGGGTTA AGGCCGTCTC CAAAGAACGA AGCGCAGAAC CGTAGTCGCG 
4141 GATCCACATT GGACTTCTTT AACGCGTTTG CGTCCTGATC CACCTTTCAA CCCCGTTCCG 
4201 CGTGACGCGG CGCGCAGAGA GTGGTCGCAT ATCGCGTCAC TGTTCTCGTG CCAGTGCTTG 
4261 GAAAGCGTCG AGCACTCTGG TTCGCGTTCT TGACGTTCGC GCCCGCCCCT AGAGGTAGCG 
4321 TGTCACGTGA CTGAAGCCAA TGAGTGCAAC TCGGCGTCGC GAAAGGTTTC AGTCGCGGTT 
4381 GAGCAAGACA CCGCAAGACT ACTGGAGTGC GTGCACAAGC GCCTCCAGCT CACGG 



Seq. ID No. 5 



1 acgaccgccg cgacc-ggtc ggcggtgccg acaggaaccg tcgacttgcc gacgaccacc 
61 tcgtaccggt cgargcacga cccaatgccg cccgcaaccg agaagacgta cgtcaggtcc 
121 gccgccccgc tttcacccac gggcgtcggg acggcgacga aaacgacgtc cgcgtgctcg 
181 attccgcgct gccggccggt ggtgaagtca accagcccgc tcccacggct ccccgcaatc 
241 aactcccaac ccgggcccga aaatcgggac accgcctgcg aggagcaaat cgaccrtggc 
301 ctgatcgata tcgacacaga cgacatcgtt gccgctaccc gcgagacagg cgcccgtgac 
361 gaggcctaca cagcctga 
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1 MIAVIWSAVPTGTVDLSTITLYRSMYDPMS 
31 SATHKTYVRSAAPLSPMGVGTAMKMTSACS 
61 IPRCRSVVKSISP?5RFLAINSQPGLSNRD 
91 TACSSQIDLGLIDIDTDDIVAAIRETGARD 
121 EAYIA 
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Seq. ID No. 7 

1 gcgccacccg ccccaaccgc gtcggcgaca acgatcccgc cgaacgaccc cgagggaccg 
61 aaaagcaccg cggagagcgc ccgcgcgcag cgccacgggg ggcgaaccga gcacaccgcc 
121 accgacggcg gaccgggcga cgccgccgcg gagcacccgc ccggcgaccc cggccccgca 
181 caccggcaac cccagcccga caacgggaga cacgacgcga cgaaccaggg caccgcccac 
241 ccgccgggcg acccgccgcg gcccacgcac cccacggacc gcccccccga cccagacgca 
301 gccgcccccg eggcggaggc gcccccgggg cacggaccag cacgegaccc gcggggccac 
361 gggaaaaaca accccgccgg acccgacggc aaaccacccc ccccccggcc gcacggccat 
421 acgccgccca agacgcggaa acccccgccc ggcgcgacgg ccgcgcacca ggcgacactc 
481 cccggcgcgc cgccggcagc caagccgggc ggccacgacc ccgaccccgg acccgaggcg 
541 gaccagccgc ccacccaccg cgccgcacca acacggcccc ccgtcacgac cgaccgcgcg 
601 gcccgcgacc ccgacgccac gggacccggc ccaacccagc ccacccgcga gcaccaccgg 
661 accctgcggc ggccccggga cccgcacggc gactacccgc cgggtgggcg cagagcgccg 
721 cgggcccacc cgcgcgcgaa ggagcacccg actcgggccg acccggccgc acccaacgcg 
781 gcaaagcccc cgcgagcgaa gcccgccaga gccccgcgga agcaaaaccc acag 
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Seq. ID No. 9 



1 gcgaagcgag cgcccacaac agggaccacg gggcaggacg gcccccaccc cgccgagcca 
61 ccaccgagca agggacacga ggtccacggg cccgcccgcc gagccccgac gcccaacacg 
121 ccgcggaccg atcacctcca cgccgaccca caccaaccgg gcgcgcgccc gttcttgcac 
181 cacgcagacc ccaccgacgg cacccggccg gcgaccccgc ccagcagcac cgacccggac 
241 gaggcccaca accccgcagc gcagccccac gcgcgcgcca gccccgacga gccagcgcac 
301 accggagaca ccaccggcac gggaccgacc cgacccccgg aagcagtccg ccccccccgg 
361 gcggaccgcc ggccccacca ggcccccccg ccggagacgc tcggcgcacc cccgccaccg 
421 cagaacgaac cgacgccgcc ccacccccgc ccgccacacg gcgcggccaa ggcccccccg 
481 caccggacga cccgcaacca ccgagaggcg cacggaccac ccgcagcgaa cggcaccccg 
541 cccaaccacg agcccccccg gcgcggcgag acccccgcga cccgaaagac cacgcgcgcc 
601 gcggcgcgca cccgagccgg cgcccaaccg gaggcctaca cgggcaaccc cgacgcgacc 
661 cgcgaccggg gctacgcgcc cgaacacgcc gaggggacgc ggaggacgcc gcaagcgccc 
721 gaacccgacg accacgcccc ggcgacaggg cgcggccaca ccgcacgcga gcccgcccaa 
781 gccgcccccg accacgccgg gcccgaccgg caaaagcgcg ccaagcccga cgaccgccac 
841 ccgcgcccca ccgaggccga cccgccagca ggagacgccg acaaggcggc ccagccaccc 
901 ggccggaaag ccccggccca caccggcgaa cccgcgcgca ccacggcgga cgcggacacc 
961 gccgcgccgg agcgcgacgg cacaccacgg accgacacgc cgacgccgcc cggccggggc 
1021 agagcaagcc ga 
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Seq. ID No. 11 



1 gtgaagcgag cgcrcataac agggaccacg gggcaggacg gc-cct:accc cgccgagcca 
61 ccaccgagca agggacacga ggttcacggg ctcgctcgcc gagccccgac gcctaacacg 
121 ccgC ggat:cg accaccccra cgtcgaccca caccaaccgg gcgcgcgcct gtccctgcac 
181 tatgcagacc tcactgacgg cacccggccg gcgaccctgc tcagcagcac cgacccggat 
241 gaggcctaca accccgcagc gcagccccac gcgcgcgtca gctctgacga gccagtgcac 
301 accggagaca ccaccggcat gggaccgacc cgactcctgg aagcagcccg ccctccccgg 
361 gtggaccgcc ggc-czatca ggctccctcg ccggagacgt ccggcgcacc cccgccaccg 
421 cagaacgaac cgacgccgcc ctacccccgt ccgccacacg gcgcggccaa ggccccctcg 
481 taccggacga cccgcaacca tcgagaggcg cacggactac tcgcagtgaa cggcacc^cg 
541 cccaaccacg agcccccccg gcgcggcgag ac^tcgcga cccgaaagac cacgcgcgcc 
601 gtggcgcgca cccgagcugg cgtccaatcg gaggtctata cgggcaacct cgacgcgatc 
661 cgcgaccggg gccacgcgcc cgaacacgcc gaggggacgc ggaggacgtt gcaagcgcc- 
721 gaacctgacg accacgcccc ggcgacaggg cgiggccaca ccgcacgcga gtticgcccaa 
781 gccgcccccg accacgtcgg gcccgaccgg caaaagcacg ccaagttcga cgaccgccac 
841 ttgcgcccca ccgaggccga ctcgctagza ggagatgccg acagggcggc ccagccaccc 
901 ggctggaaag ccticggccca caccggcgaa czcgcgcgca tcatggcgga cgcggacacc 
961 gccgcgccgg agcgcgacgg cacaccacgg accgacacgc cgacgccgcc cggctggggc 
1021 ggagtaagct ga 
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Seq. ID No. 13 

1 gcgcgacggc acaccacgga zcgacacgcc gatgttgcct ggccggggca gagtaagctg 
61 acgactacac ccgggcc^cc ggaccgcgca acgcccgcgc acatcgccgg ccaccggggg 
121 ccggccggcc cagcgcccgc acgtagaccc gaggccgagg ggcccaccaa ccccatcgcg 
181 cgatcacgcg acgagactga cccgacggac cgagccgcaa cgctcgacct cgtgcctgag 
241 acaagaccac aggcgaccat cgatgcggcc gcacgggccg gcggcaccac ggcgaacaac 
301 acccaccccg cggaccccct guccgaaaac ctccgaaccc agaccaacnc gcrcgacgca 
361 gctgtcgccg tgcgcgtgcc gcggcccccc ttccccggtc cgccatgcac ctacccgaag 
421 cacgctccgc aacctaccca cgagagcgcc ccactgaccg gccccttgga gcccaccaac 
481 gacgcgcatg cgaccgccaa gaccgccggc accccgcaag cccaggcggc caggcgccaa 
541 tatgggccgg cgcggacctc cgcgacgccg accaaccccc acggacccgg cgacaacctc 
601 tccccgtccg ggtcgcatcc cctgccggcg ctcacccgcc gacatgagga agccaaagcc 
661 ggcggcgcag aagaggcgac gaatcggggg accggtactc cgcggcgcga acccccgcac 
721 gtcgacgacc tggcgagcgc acgcccgttc ctcccggaac accccgacgg cccgaaccac 
781 gtcaacgcgg gcaccggcgc cgaccacagc actagcgaga ccgcagacac ggccgctaca 
841 gcggtgggcc acatcggcga aacacgctgg gacccaacca aacccgacgg aaccccgcgc 
901 aaactattgg acgtctccgc gccacgcgag ttgggttggc gcccgcgaat cgcactgaaa 
961 gacggcatcg acgcaacgg^ gtcgtggtac cgcacaaacg ccgacgccgt gaggaggcaa 



Seq. ID No. 14 
DRHADVAWLGQSKLTTTPGPLDRA 
GKRGLVGSALVRRFEAEGFTNLIV 
DLTDRAATFDFVSETRPQVIIDAA 
MANNTYPADFLSSNLRIQTNLLDA 
PRLLFLGSSCIYPKYAPQPIHESA 
EPTNDAYAIAKIAGILQVQAVRRQ 
SAMPTNLYGPGDNFSPSGSHLLPA 
EAKAGGAEEVTNWGTGTPRRELLH 
ACLFLLEHFDGPNHVNVGTGVDHS 
MVATAVGYIGETRWDPTKPDGTPR 
ALRELGWRPRIALKDGIDATVSWY 
VRR 
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Seq. ID No. 15 

1 gcgcganggc acaccatgga ccgacacgcc gacgctgccc ggccggggcg gagcaagctg 
61 acgacxacac ccgggccccc ggaccgcgca acgcccgcgc acaccgccgg ccaccggggg 
121 ctggtcggcc cagcgcrcgc acgcagactc gaggccgagg ggctcaccaa tctcaccgtg 
181 cgaccacgcg acgagaccga cccgacggac cgagccgcaa cgcccgacct cgtgtctgag 
241 acaagaccac aggtgatcac cgacgcggcc gcacgggccg gcggcaccac ggcgaataac 
301 acccaccccg cggacttccc gcccgaaaac ctccgaaccc agaccaatct gcccgacgca 
361 gccgtcgccg cgcgcgcgcc gcggcccctc ctcctcggtc cgtcacgcat ctacccgaag 
421 tacgcxccgc aacccaccca cgagagcgct ccatcgaccg gcccttcgga gcccaccaac 
481 gacgcgtacg cgatcgccaa gaccgccggt accctgcaag ttcaggcggc caggcgccaa 
541 tacgggccgg cgcggatccc cgcgacgccg actaaccccc acggacccgg cgacaacttc 
601 tccccgcccg ggccgcacct cctgccggcg ctcacccgcc gacatgagga agccaaagcc 
661 ggcggcgcag aagaggtgac gaatcggggg accggtactc cgcggcgcga actcctgcac 
721 gtcgacgacc cggcgagcgc atgcctgtcc ccttcggaac actccgacgg cccgaaccac 
781 gccaacgcgg gcaccggcgt cgatcacagc actagcgaga tcgcagacat ggccgccacg 
841 gcggtgggct acaccggcga aacacgctgg gacccaacta aacccgatgg aaccccgcgc 
901 aaactattgg acgtccccgc gctacgcgag ctgggttggc gcccgcgaac cgcaccgaaa 
961 gacggcaccg acgcaacggc gtcgtggtac cgcacaaacg ccgatgccgc gaggaggtaa 



Seq. ID No. 16 
HTMDRHADVAWLGRSKLTTTPGPLDRA 
YIAGHRGLVGSALVRRFEAEGFTNLIV 
DSIDLTDRAATFDFVSETRPQVIIDAA 
GGIMANNTYPADFLSSNLRIQTNLLDA 
VRVPRLLFLGSSCIYPKYAPQPIHESA 
GPLEPTNDAYAIAKlAGIjQVQAVRRQ 
AWISAMPTNLYGPGDNFSPSGSHLLPA 
RYEEAKAGGAEEVTMWGTGTPRRELLH 
LASACLFLLEHFDGPNHVNVGTGVDHS 
IADMVATAVGYIGETRWDPTKPDGTPR 
DVSALRELGWRPRIALKDGIDATVSWY 
ADAVRR 



Seq. ID No. 17 

1 acggattmt cgcgcaacgc cggcttgacg gcccgtaacg ccagcaccga gatgccgcgc 
61 cacttcgaac gaaagcgccc attagtaaac caatccaaag catacggagt caacgttgtt 
121 atcgatgtcg gtgccaactc cggccagtrc ggcagcgcct tgcgccgtgc aggattcaag 
181 agccgcaccg tttcctttga acctcttccg gggccacccg cgcaaccaac gcgcaagtcg 
241 gcaccggatc cactatggga gcgtcaccag tatgccccag gcgacgccga cgagacgact 
301 accaccaacg cggcaggcaa egcgggggca agtagtcccg cgccgccgac gctcaaaagt 
361 caccaagacg cctcccctrcc cgcgaactat accggcaccg aagacgccgc aacacaccgc 
421 cntgattcgg tcgcatcaga attcctgaac cctaccgacg ttaccttcct: gaagaccgac 
481 gcacagggct ccgagaagca ggttaccacg ggcagtaagr caacgcccaa cgaaagctgc 
541 gtcggcacgc aacccgaacc ttctcctatc ccgccgcacg aaggtgacat gccgacccac 
601 gaagcgcccg aactcgccra c;:ccctaggc cccagaccga cgggcccgcz gcccggcctt 
661 acggacccgc gcaatggccg aacgccccaa gccgacggca tcctcttccg cggggacgat 
721 cga 
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Seq. ID No. 18 
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VSTEMLRHFERKRLLVN 
GANSGQFGSALRRAGFK 
AQLTRKSASDPLWECHQ 
VAGNAGASSSVLPMLKS 
EDVAIHRLDSVASEFLN 
FEKQVITGSrCSTLNESC 
EGDMLIHSALELVYSLG 
RNGRMLQADGI FFRGDD 



Seq. ID No. 19 

1 atggatctrt cgcgcaacgc cggcttgacg gctcgtaacg ttagcaccga gacgctgcgc 

61 cacttcgaac gaaagcgccc actagcaaac caactcaaag cacacggagt caacgccgcc 

121 atcgatgccg gcgctaactc cggccagtcc ggtagcgctx cgcgtcgtgc aggatccaag 

181 agccgtaccg cccccctcga acctctttcg gggccacrtg cgcaactaac gcgcgagccg 

241 gcatcggacc caccacggga gtgtcaccag tacgccccag gcgacgccga cgagacgact: 

301 accaccaacg cggcaggcaa cgcgggggca agcagtcccg cgcrgccgat gcccaaaagt 

361 catcaagatg cctctcctcc cgcgaatcat attggcaccg aagacgctgc aacacaccgc 

421 cstgacccgg ccgcaccaga acctccgaac cctaccgacg ttacttccct gaagaccgac 

481 gtacagggct ccgagaagca ggtcaccgcg ggcagtaagt caacgctcaa cgaaagctgc 

541 gccggcatgc aactcgaact ctctcctacc ccgttgcacg aaggcgacac gccgattcat 

601 gaagcgcnTig aacctgtcca tcccccaggt cccagaccga cgggtccgcc gcccggaccc 

661 acggatccgc gcaacggccg aacgcticcaa gctgacggca ccctcccccg cggggacgac 
721 cga 



Seq. ID No. 20 
AGLMARNVSTEMLRHPSRKRLLVH 
VNVVIDVGANSGQFGSALRRAGFK 
SPLSGPFAQLTRESASDPLWSCHQ 
DETITINVAGNAGASSSVLPMLKS 
PANYIGTEDVAIHRLDSVASSFLN 
LKIDVQGF2KQVIAGSKSTLNESC 
LSFIPLYEGDMLIHSALSLVYSLG 
LPGFTDPRNGRMLQADGIFFRGDD 






Seq. ID No. 21 
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acgaccgcgc 


cagcgccccc 
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gaagcaccgc 
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ccgaccggac 
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cgaccggccg 


cccacagcgg 
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caggcgaacg 
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accacgccgg 


cccaggcagc 
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ggcgacgccg 


cgacgcgccc 
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cccccacccg 


agacgaaccc 
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ggcaccggcc 


cccacaaccc 
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cgccccccca 


acccggcgcc 
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gacacgaccg 


gccccagcac 




661 


acgcaccccc 


gggccgcagg 
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aaggagaacc 


gccgcccggc 


15 


781 


gaacgaagcg cagaaccgta 



gacaaccacc cccaccccca acgcagcggc gacgccgcaa 
cgggcagacc caccgggaag cggaagcggc ccccgccgac 
ccccgacacc gcgaacagcc cccgcccgga acccggcccg 
gcccgacgac ggcccccacg acgccacgaa ccgcggcgtc 
ggcacccccc ccaggcgccg acgacaccct ccacgaacca 
cgcccccccc ggcgaccacg cggcaagcca ccccgcccac 
gacgaaaagc cggcatgccg gaccccccga ccccgaccgc 
gtgccaccaa ccgacccccc accgccgcga gccccccgac 
gcgccaccga gcccgggcgg accgggaccc caacacccgc 
gaccacccgc cacacggacg ccgegacccc cgaacacaac 
gaggcagggg accgataaag agcccagaaa acggccgcca 
gcgggagact cgcaggcgca tgctggcgtc tctgaaagac 
cttgcgcacg cggctgacaa gggtcaaggc cgcctccaaa 

g 



Seq. ID No. 22 
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Seq. ID No. 23 

1 acgaccgcgc cagtgctccc gacaaccacc cccaccccca acgcagcggc gacgccgcaa 
61 gcctgccrcg gaagcaccgc cgggcagacc caccgggaag cggaagcggc ccccgccgac 
121 ggcggcccga ccgaccggac ccccgacacc gcgaacagcc cccgcccgga acccggcccg 
181 cgaccggccg cccacagcgg gcccgacgac ggcccccacg acgccacgaa ccgcggcgcc 
241 ggcgcagcca caggcgaacg ggcacccccc ccaggcgccg acgacacccc ccacgaacca 
301 accacgccgg cccaggcagc cgcccccccc ggcgaccacg cggcaagcca ccccgcccac 
361 ggcgacgccg cgacgcgctc gacgaaaagc cggcacgccg gaccccccga ccccgaccgc 
421 cccccacccg agacgaaccc gcgccaccaa ccgacccccc accgccgcga gccccccgac 
481 ggcaccggcc cccacaaccc gcgccaccga gcccgggcgg accgggaccc caacacccgc 
541 cgccccccca acccggcgcc gaccacccgc cacacggacg ccgcgacccc cgaacacaac 
601 gacacgaccg gccccagcac gaggcagggg accgacaaag agcccagaaa acggccgcca 
661 acgcaccccc gggccgcagg gcgggagacc cgcaggcgca cgccggcgcc cccgaaagac 
721 aaggagaacc gccgcccggc cccgcgcacg cggccgacaa gggccaaggc cgcccccaaa 
781 gaacgaagcg cagaaccgca g 
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Seq. ID No. 24 
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151 


S 


I 


F 


Y 


R 


R 


E 
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211 
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D 


K 
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241 


EC 
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N 
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R 


L 


A 



IIIPTFNAAVTLQACLGSIVGQT 
LVDGGSTDRTLDIAMSFRPELGS 
PDDGPYDAMNRGVGVATGEWVLF 
YEPTTLAQVAAFLGDHAASHLVY 
TKSRHAGPFDLDRLLFETNoCHQ 
LFDGIGPYNLRYRVWADWOFNIR 
ITRYMDVVISEYNOMTGFSMRQG 
RLPMYFWVAGWETCRRMLAFLKD 
LRTRLIRVKAVSKERSAEP 



Seq. ID No. 25 



1 gtggccagca gaagccccca ccccgctgcg ggtggttggc taattcccgg cggctcccrc 
61 ctngtggccg gcgcggcgca tccggtagga cccgccggag gcgacgacga tgccggcgtg 
121 gcgcagcagc cgatcgagga tgctggcggc ggcggtgcgc ccgggcagga accgccccca 
181 ctgtccgaag ggccaacgcg aggcgacggc cagggagcgg cgcccgcagc cggcagccac 
241 gagccggaac aacagtcgag tcccggtgnc gtcgagcggg gcgaagccga ccccgcccaa 
301 gatgaccaga tccgcgcgga gcagggcgcc gacgacctrg ccgacggtgc cgccggccag 
361 gccgcgsca^ aggaccce^acaggccggc ggcggcgaag cagcggactt tgaatccggc 
421 gcggacgaca gcgcgcccgc agccgacgag caggcgaccc tcgcccgcac caggcgggcc 
481 aatrgaccgcc aggtcccgtc gcgcccgaat ccactccagg cccgacaggc agtcgaacgc 
541 ggccgcggtg atcgacgatc cggtgacgtc gaacccg-cg agggceecgg cgaccgggaa 
601 ggccgcggcc ccgagacggc tggcggtgcc ggaggcatcg cgggcagcga ccccggcccc 
661 aaccaacgtc cgcaggacct cctccggLgt ccagcgtcgc gccccggcga cccgcaacac 
721 cccggcggcg ttgcggcgca ccgtggccag cttcaaccgc cgcagccccg cgtcaaggtc 
781 agcagccagc ggtgccgccg aggacggcgc caccggcitg gcagcggcgg tcatgaggcc 
841 gtcccgtcgg cggcgccgan ctcgcag 



Seq. ID No. 26 

1 VASRSPHSAAGGWLILGGSLLVVGVAHPVG 
31 LAGGDDDAGVVQQPIEOAGGGGVLGQSSPP 
61 LFEGPMRGDGQGAALVAGSHSPEQQLSPGV 
91 VERGEADLVQDDQIRAEQGVDDLADGVVGQ 
121 AAVEDLDQVGGGEVADFESGVDGSVPAADE 
151 QVTFARTRWANDRQVLLCPNPFQARQVVER 
181 GCGDRRSGDVEPVEGLGDREGCGLETVGGV 
211 GGIAGSDLGLNQRPQOLLRCPALRLGDLQH 
241 LGGVAAHRGQLQPPQRRVKVSSQRCRRGRC 
271 HRLGSGGHEAVP S V V ti I L 
Seq. ID No. 27 

1 acgggccgcc ccaaaggcgg cgtcgccgcc aacgccgc-g ccccaacacc ggaccatgtg 
61 cgactcgcgc cccaccacgg ctccgtcccg gacctczgcc acggcgcgga cccgcaaccg 
121 aagggcaccg cggagaaccc ctgtggccac gcccaggacg accczgcggc gccgccgctg 
181 accgaagccg cgccagccgg cgagcaggcc gacctacgcg cccccaacgc ccaggcgcaa 
241 ctacggtgcg ccgaggccaa tgccacggcc cactcggaga cctgcgccgc gcccaacgac 
301 cgcctggccg acgagcgcac cgtcccgagg gagccgcccc cgccgcggcc gacgaccggc 
361 ccggggccgg cgcgccgcaa ggccgacggc ccctcgcgca tccgctacgg cccagctcgt 
421 cacccggcgc cccagcggct cgccggcgcc accgtggcgg cggcggccga ccacggcgcc 
481 ccgaccccgc cggaacctgc gaccggcgtg atcgtggccg agcacgagcc cgccagccca 
541 ggtgaggcgc ccaccctcga tgaacactac gacggaccca gacccgcacc ctcgcgtggt 
601 ccccgcccga aaacccaagc agagaaacga ctctgcgcac tgggaaccga agcgcagcag 
661 tcccccgccg gcgccgctgc gaccggcaac acccgaccga aacccgaacc cgacaccctg 
721 cccggccccg gcgccgccca cggcgaacag gcttcgactg acgcgctgcg ccgggcggcc 
781 gcgtttcgcc ggctccgcgc tgccgacgcg cgcccgaccc cggccgccgg cgccggcacc 
841 ccacaacccc gccccgccgg cgacgcaccc gcgcccgacc cgcccaccgt cgagacccgc 
901 tcgttggagg cccacaagac caacaccacc gacgggacgg ccccacgacc accgccgcca 
961 agccggcggc accgccctcg gcggcaccgc tggctgccga cctcgacgcg gcgc^gcggc 
1021 ggtcgaagct ggccacggtg cgccgcaacg ccgccgaggc gttgcaagtc gccaagacgc 
1081 aacgctggac accggaggag accctgcgga cgccggccga ggccgagatc gccgcccgcg 
1141 atgcccccaa caccgccaac cgtcucaagg ccgcagcccc cccggccacc aagacccccg 
1201 acgggtccga cgccaccgga ccgtcgatca ccgcagccac gcccgactac ccgtcgagcc 
1261 cggaatggac ccgggcacaa cagaacccgg cggccaccgg cccacctggc acgggcaaaa 
1321 gtcacccgcn caccggcrgc gggcacgccg ccgtccacgc cggatccaaa gcccgccact 
1381 tcaccgccgc cgacctgatc gaggccctct: accgcggccr ggccgacaac accgtcggca 
1441 agaccaccga caccccgcrc cgcgcggacc cggtcaccrc ggacgagacc ggcticcgccc 
1501 cgcccgacga caccgggacc caactgccgt tccggctcgc ggccgccggc cacgagcgcc 
1561 gctccccggc caccgccrcg cactggcccc tcgaacaacg ggggcgaccc ccgcccgagc 
1621 acaccaccgc cgccagcatc ctcgatcggc cgctgcacca cgccagcacc gtcgccacct 
1681 ccggcgagcc ctaccggacg cgccacgccg accacaagaa gggagccgcc aagaaccag 



Seq. ID No. 28 
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Seq. ID No. 29 

1 MTTAAKPVAPSSAAPLAADLDAALRRLKLA 
31 TVRRNAAEVLQVAKTQRWTPEEILRTLVEA 
61 EIAARDASNTAMRLKAAAFPVTKTLDGrDV 
91 TGSSITAATFDYLSSLEWIRAQQMLAVIGP 
121 PGTGKSHLLIGCGHAAVHAGFKVRYFTAAD 
151 LISVLYRGLADNTVGKIIDTLLRADLVILD 
181 EIGFAPLDDTGTQLLFRLVAAGYERRSLAI 
211 ASHWPFSQWGRFLPEHTTAASILDRLLHHA 
241 SIVVTSGESYRMRHADHKKGAAKN 



Seq. ID No. 30 

1 gcgacgcctg ccccgaccgc ctcggtgata acgacctcgc ccaacgacct: cgacgggttg 
61 cagcgcacgg tgaaaagcgc gcgggcgcaa cgctaccggg gacgcaccga gcacatcgca 
121 accgacggcg gcagcggcga cgacgtggcg gcacacctgi: ccgggcgcga accaggcttc 
181 gcgcactggc agcccgagcc cgacggcggg cggcacgacg cgacgaacca gggcatcgcg 
241 cacgcaccgg gtgacccgcc gtggcccccg caccccgccg accgcrctcc cgggcccgac 
301 gtggcagccc aggccgcgga ggcgccaccc ggcaagggac cggcgticcga actgcggggc 
361 ctcgggatgg accgcctcgc cgggcccgat cgggtgcgcg gcccgacacc cctcagcccg 
421 cgcaaatccc cggccggcaa gcaggtrgcc ccgcaccaag caccgcccrr cggatcatcg 
481 ccggcggcca agatcggngg ccacgacctc gacctcggga ccgccgccga ccaggaactc 
541 atattgcggg ccgcgccggt acgcgagccg gccacgac-c ggtgcgtgcc gtgcgagccc 
601 gacaccacgg gcgncggccc gcaccgggaa ccaagcgcgg tctccggtga tctgcgccgc 
661 atgggcgacc ttcaccgccg ctiacccgccc gggggaaggc gaacaccaca cgcccaccca 
721 cgcggccggg agccccacgc ccacaacagc cgattctggg aaaacgtctit: cacgcgaacg 
781 tcgaaacag 



Seq. ID No. 31 
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FNDLDGLQRTVKSVRAQ 
GSGDDVVAYLSGCEPGF 
AMNQGIAHASGDLLWFL 
QAVEALSGKGPVSELWG 
GPIPFSLRKFLAGXQVV 
KIGGYDLDFGIAADQEF 
RCVLCEFDTTGVGSHRE 
LHRRYPFGGRRISHAYL 
ENVFTRMSK 
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Seq. ID No. 32 

1 gcgaagcgag cgcncaccac cggaaccacc ggccaggacg gcccgtaccc cgccgaaccg 
61 ccgccggcca aggggtacga ggctcacggg ctcatccggc gcgccccgac gtccaacacc 
121 tcgcggaccg accaccccca cgccgacccg caccaaccgg gcgcgcggcc gcctctcrcsc 
181 cacggcgacc cgaccgacgg aacccggccg gcgaccctgc cgagcaccac cgaacccgac 
241 gaggcgcaca acctggcggc gcagccacac gtgcgggcga gcttcgacga acccgtgcac 
301 accggtgaca ccaccggcac gggacccacg cgactgccgg aagccgctcg gctcccccgg 
361 gcgcaccgcc gcttctatca ggcgccctcg tcggagatgt ccggcgcccc gccgccaccg 
421 cagaacgagc cgacgccgcc ctacccgcgg ccaccgcacg gcgccgccaa ggtctactcg 
481 tactgggcga cccgcaacca ccgcgaagcg tacggatcgt ccgccgctaa cggcacctcg 
541 ttcaatcacg aaccaccgcg gcgcggcgag acgttcgcga cccgaaagac caccagggcc 
601 gtggcacgca tcaaggccgg tatccagtcc gaggtccaca cgggcaaccc ggatgcggtc 
661 cgcgaccggg ggcacgcgcc cgaacacgtc gaaggcacgc ggcggacgcc gcagaccgac 
721 gagcccgacg acctcgtttt ggcgaccggg cgcggtttca ccgcgcgcga gctcgcgcgg 
781 gccgcgcccg agcatgccgg tttggaccgg cagcagcacg tgaaactcga ccaacgctac 
841 ctgcggccca ccgaggcgga tccgctgacc ggcgacgcga ccaaggccgc cgaatcgccg 
901 ggctggaggg ctccggtgca cactgacgag ctggcccgga ccacggccga cgcgga'catg 
961 gcggcgccgg agtgcgaagg caagccgugg atcgacaagc cgatgaccgc cggccggaca 
1021 cga 



Seq. ID 
Seq. ID No. 34 



1 acgaggccgg cccgtcgcgc tcggaacatc ctgcgccgca acggcatcga ggtgtcgcgc 
61 cacctcgccg aaccggaccg ggaacgcaac ttctcgcgcc aaccgcaacc gcaccgggtc 
121 agcgccgcgc ccgacgtcgg ggccaacccg gggcagcacg ccaggggtcc gcgcggcgcg 
181 ggccccgcgg gccgcaccg: ctcgtccgag ccgccgcccg ggccccccgc cgccttgcag 
241 cgcagcgccc ccacggaccc gctgtgggaa cgccggcgct gcgcgccggg cgacgccgac 
301 ggaaccaccr cgaccaacgc cgccggcaac gagggcgcca gcagtcccgc cccgccgsc?: 
361 ttgaaacgac accaggacgc etccccacca gccaaccacg cgggcgccca acgggcgccg 
421 acacaccgac ccgactccgc ggccgcagac gcccrgcggc ccaacgaeac cgcgctcctg 
481 aagaccgacg cccaaggac: cgagaagcag gcgaccgcgg gcggcgactc aacggcgcac 
541 gaccgacgcg ccggcacgca gcccgagccg cctccccagc cgccgcacga gggcggcacg 
601 cccacccgcg aggcgcccga ccccgcgga: ccgccgggcc tcacgccctc gggaccgcaa 
661 cccggtccca ccgacccccg caacggccga atgc=gcagg ccgacggcac ccccrcccgg 
721 ggcagcgacc ga 
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Seq. ID No. 35 
RNILRRMGIEVSRYFAELDWERN 
HRVSAVLDVGANSGQYARGLRGA 
SFS?U?GPFAVLQRSASTDPLWE 
OVDGTISIMVAGNEGASSSVLPM 
FPPANYVGAQRVPIHRLDSVAAD 
AFLKIDVQGFEKQVIAGGDSTVH 
LELSFQPLYEGGMLIREALDLVD 
GLQPGFTDPRNGRMLQADGIFFR 



Seq. ID No. 36 



1 gtgaaatcgt tgaaacccgc ccgttcacc gcgcgcagcg ccgccttcga ggtttcgcgc 
61 cgctactctg agcgagaccc gaagcaccag cttgtgaagc aactcaaacc gcgtcgggta 
121 gacgccgcct ccgacgccgg cgccaacca ggacaacacg -ccgccggcc. ccgccgagca 

gca-caagg gccgcaccgt ctcgtccgaa ccgctacccg gaccgc.tac gacctcggaa 

Igcaaagcgc caacggaccc actttgggac cgccggcagc acgcgcggg cgacnccgat 
301 ggaacggcca cgatcaacat cgcaggaaac gccgg.caga gcagttccgt ctcgcccacg 

Sgaaaagnc a.cagaacgc ««cccccg gcaaaccacg ccggcaccca agaggcg.cc 
421 aclaccgac ccgac.ccg, ggcgccagaa tctctaggca cgaacggtgt cgccccte*. 
481 aagg.cgacg cccaaggctc ngaaaagcag gcgctcgccg ggggcaaacc aaccacagac 
541 gacLcgca tcgacatgca actcgaactg cectccccge cg^gtacga aggcggcatg 

ctca" *U aag^cctcga cctcg.gcac tcc.cgggcc ccacgtcgac gggaccgccg 
Z cc";;:La ccga.gcaaa -eggcega acgc.gcagg ccgacggcat c«t«ccge 
721 gaggacgacc ga 
Seq. ID No. 37 
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SAAFEVSRRYSERDLKHQ 
VFDFTVGANSGQYAAGLR 
PLSGPFTILESKASTDPL 
GTVTINIAGNAGQSSSVL 
ANYVGTQEASIHRLDSVA 
XVDVQGFSKQVU5GKST 
SFLPLYEGGMLIPEALDL 
PCFIDANNGRMLQADGIF 
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Seq. ID No. 38 



1 acggcgcaga cgaaacgaca cgccggcctg accgcagcta acacaaagaa agccgccacg 
61 gccgcaccaa cgttctcgac caccaccccc accttgaacg cggccgcggc actgcccgcc 
121 cgccccgaca gcaccgcccg ccagacccgc ggcgaccccg agccggcact ggccgacggc 
181 ggctcgacgg acgaaaccct cgacaccgcc aacactctcg cccccaacct cggcgagcgg 
241 ttgaccaccc accgcgacac cgaccagggc gcctacgacg ccacgaaccg cggcgrggac 
301 ccggccaccg gaacgcggcc gctccctccg ggcgcggacg acagcctgca cgaggccgac 
361 accccggcgc gggtggccgc cttcaccggc gaacacgagc ccagcgatct: ggtatatggc 
421 gacgcgacca cgcgcccaac caatccccgc tggggtggcg ccttcgacct cgaccgrccg 
481 ccgcccaagc gcaacacccg ccaccaggcg atcttctacc gccgcggacr ctccggcacc 
541 accggtccct acaacccccg ccaccgggtc ctggccgacc gggactccaa tactcgctgc 
601 cctcccaacc cagcgctcgc cacccgctac acgcacgcgg ccgtcgcaag ccacaacgaa 
661 tccggcgggc ccagcaacac gatcgccgac aaggagtttt tgaagcggct gccgatgtcc 
721 acgagacccg gcacaaggcc ggccacagct ctggtgcgca ggtggccaaa ggtgatcagc 
781 agggccatgg taatgcgcac cgtcattcct tggcggcgcc gacgtcag 
Seq. ID No. 39 

ANTKKVAMAAPMFSIIIP 
DSIARQTCGDFELVLVDG 
FAPNLGERLIIHRDTDQG 
TGTWLLFLGADDSLYEAD 
EPSDLVYGDVIMRSTNFR 
KRNICHQAIFYRRGLFGT 
DWDFNIRCFSNPALVTRY 
GLSNTIVDKSFLKRLPMS 
RRWPKVISRAMVMRTVIS 

Seq 40: 

GATGCCGTGAGGAGGTAAAGCTGC 

Seq 41: 

GATACGGCTCTTGAATCCTGCACG 



